Use cases for self-modifying code?

On a von Neumann architecture, program and data are both stored in memory, so a program can modify itself. Is this useful for a programmer? Could you give some examples?

Metamorphism
One (questionable) use case that comes to my mind is metamorphic computer viruses. These are malicious pieces of software that conceal themselves from signature-based detection by rewriting their own machine code into a semantically equivalent representation that looks different.
Trampolining
Another (more complex, but also more common) use case is trampolining, a technique based on dynamic code generation to solve certain problems with nested function calls.
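As a concrete (and platform-specific) illustration, GCC's nested-functions extension emits exactly such a trampoline whenever you take the address of a nested function. This is a minimal sketch assuming GCC on a platform that allows the runtime-generated stub to execute; the helper names are made up for the example.

#include <stdio.h>

/* Calls f once for every element of a. */
void for_each(const int *a, int n, void (*f)(int)) {
    for (int i = 0; i < n; i++) f(a[i]);
}

int main(void) {
    int total = 0;
    /* Nested function (GCC extension) that captures 'total' from main's frame. */
    void add(int x) { total += x; }
    int data[] = { 1, 2, 3, 4 };
    /* Passing add forces GCC to build a small trampoline at runtime: a stub that
       loads the pointer to main's frame and then jumps to the real code of add.
       This usually requires an executable stack, which the toolchain arranges. */
    for_each(data, 4, add);
    printf("%d\n", total);   /* prints 10 */
    return 0;
}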
JIT compilation
The most common use of dynamic code generation that I can think of is JIT (just-in-time) compilation. Modern platforms like .NET or the JVM do not compile programs into native machine code, but into some kind of intermediate language (called bytecode). This bytecode is then interpreted when the program is executed (by a virtual machine written for the target architecture). At the same time, a background process checks which parts of the code are executed very often. These parts then have a good chance of being dynamically compiled into native machine language for maximum performance. All this happens during the runtime of the program!
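To make the idea concrete, here is a minimal sketch of the core trick every JIT relies on: write machine code into memory at runtime, mark the memory executable, and call it. The sketch assumes C on x86-64 Linux; the instruction bytes and mmap flags are specific to that platform, and hardened systems that enforce W^X may refuse the writable-and-executable mapping.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    /* x86-64 machine code for:  mov eax, 42 ; ret */
    unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    /* Ask the OS for a page we may both write to and execute. */
    void *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;

    memcpy(mem, code, sizeof code);          /* "compile" at runtime   */
    int (*fn)(void) = (int (*)(void))mem;    /* treat the data as code */
    printf("%d\n", fn());                    /* prints 42              */

    munmap(mem, 4096);
    return 0;
}

Real JITs refine this by generating the code from a higher-level representation and by flipping the page between writable and executable instead of keeping it both at once.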
Security implications
One thing to keep in mind is that the ability to interpret data as code is also what makes many security exploits possible, which is why the trend in modern hardware and operating systems is to enable and, if possible, even enforce the separation of code and data (see also the NX bit and DEP).

I can best answer this by referring you to an answer to a similar (exceptionally well written and answered) question, also on Stack Overflow - Homoiconic and "unrestricted" self modifying code + Is lisp really self modifying?. The answer focuses on Lisp, a family of languages known for taking "code is data" to the next level, and explores the uses of that in AI.

Related

What does it mean if someone refers to something as BootStrap?

I hear the term "BootStrap" thrown around a lot, but I'm not really sure what it refers to. I know there is a Bootstrap CSS framework, but what exactly does the term mean?
Literally, a bootstrap is a tab on the sides or back of boots that helps you to pull them on. Putting on your shoes or boots is usually the last step of getting dressed; similarly, in programming it's been applied to the initialization or start-up step of a program.
See also the Wikipedia entry for bootstrapping:
Bootstrapping or booting refers to a group of metaphors which refer to a self-sustaining process that proceeds without external help.
[.. in Software Loading] booting is the process of starting a computer, specifically in regards to starting its software. The process involves a chain of stages, in which at each stage a smaller simpler program loads and then executes the larger more complicated program of the next stage. It is in this sense that the computer "pulls itself up by its bootstraps", i.e. it improves itself by its own efforts
[.. in Software Development] bootstrapping can also refer to the development of successively more complex, faster programming environments. The simplest environment will be, perhaps, a very basic text editor (e.g., ed) and an assembler program. Using these tools, one can write a more complex text editor, and a simple compiler for a higher-level language and so on, until one can have a graphical IDE and an extremely high-level programming language.
A shoehorn is another means to help you don footwear but it's idiomatically come to mean cramming something into a tight space.
In computer science Bootstrap (or more commonly "boot") generally refers to the setup/start/initialization step of a process. It can mean many things depending on the context: starting a physical machine, setting up variables and services for an application to use, or even laying the CSS groundwork for a website to implement.
Bootstrapping lets you create a complex design from just a minimal configuration, rather than developing it from scratch.

Abstraction or not?

The other day I stumbled onto a rather old Usenet post by Linus Torvalds. It is the infamous "You are full of bull****" post in which he defends his choice of plain C for Git over something more modern.
In particular, this post made me think about the enormous number of abstraction layers that accumulate one on top of the other where I work. Mine is a Windows .NET environment. I must say that I like C# and the .NET environment; it really makes most things easy.
Now, I come from a very different background made of Unix technologies like C and a plethora of scripting languages; to me, also, OOP is just one, and not always the best, programming paradigm. I often struggle (in a working kind of way, of course!) with my colleagues (one in particular), because they appear to be of the "any problem can be solved with an additional level of abstraction" church, while I'm more of the "keep it simple" school. I think that there is a very different mental approach to the problems that maybe comes from exposure to different cultures.
As a very simple example, for the first project I did here I needed some configuration for an application. I made a 10-line class to load and parse a text file, located in the program's root dir, containing colon-separated key/value pairs, one per row. It worked.
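(For scale, the whole loader amounts to little more than the following. This is a sketch in C rather than the original C#, and the function and file names are made up; it just shows how small "read colon-separated key/value pairs" really is.)

#include <stdio.h>
#include <string.h>

/* Look up 'key' in a file of "key:value" lines; returns 1 and fills 'out' on success. */
int config_get(const char *path, const char *key, char *out, size_t outsize) {
    FILE *f = fopen(path, "r");
    if (!f) return 0;
    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        char *sep = strchr(line, ':');
        if (!sep) continue;
        *sep = '\0';                              /* split at the first colon   */
        if (strcmp(line, key) == 0) {
            strncpy(out, sep + 1, outsize - 1);
            out[outsize - 1] = '\0';
            out[strcspn(out, "\r\n")] = '\0';     /* strip the trailing newline */
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}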
In the end, to standardize the approach to the configuration problem, we now have a library, to be installed on every machine running each configured program, that calls a service that, at startup, loads an XML file containing references to other XML files, one per application, which contain the configurations themselves.
Now, it is extensible and made up of fancy reusable abstractions, providers and all, but I still think that if one day we really do reuse part of it, in the time it took to build we could have written the needed code from scratch, or copy/pasted the old code and modified it.
What are your thoughts about it? Can you point out some interesting reference dealing with the problem?
Thanks
Abstraction makes it easier to construct software and understand how it is put together, but it complicates fully understanding certain issues around performance and security, because the abstraction layers introduce certain kinds of complexity.
Torvalds' position is not absurd, but he is an extremist.
Simple answer: programming languages provide data structures and ways to combine them. Use these directly at first, do not abstract. If you find you have representation invariants to maintain that are at a high risk of being broken due to a large number of usage sites possibly outside your control, then consider abstraction.
To implement this, first provide functions and convert the call sites to use them without hiding the representation. Hide the data representation only when you're satisfied your functional representation is sufficient. Make sure at this time to document the invariant being protected.
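A tiny sketch of that progression, in C with made-up names: start with the bare representation, then route call sites through functions that enforce the invariant, and only hide the struct once those functions have proven sufficient.

#include <assert.h>

/* Step 1: the bare representation -- call sites may still touch the field directly. */
struct account { long balance_cents; };

/* Step 2: a function every call site goes through. The representation is still
   visible, but the invariant (the balance never goes negative) now lives in
   exactly one documented place. */
void account_withdraw(struct account *a, long cents) {
    assert(cents >= 0 && cents <= a->balance_cents);   /* the protected invariant */
    a->balance_cents -= cents;
}

/* Step 3 (only if the invariant is at real risk): move the struct definition into
   the .c file and expose an opaque 'struct account;' in the header. */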
An "extreme programming" version of this: do not abstract until you have test cases that break your program. If you think the invariant can be breached, write the case that breaks it first.
Here's a similar question: https://stackoverflow.com/questions/1992279/abstraction-in-todays-languages-excited-or-sad.
I agree with @Steve Emmerson - 'Coders at Work' would give you some excellent perspective on this issue.

lowest level language until asp.net?

It's assembler, right? Can someone please point out the progression that we've had in programming languages from assembler to the days of ASP.NET, namely the chronological order of languages?
Here's a wiki timeline of all programming languages.
I would include a FTA table, but the list is very robust and extensive.
And also, the lowest language you ever get to is assembly (aside from straight-up issuing machine instructions), regardless of what other language is built on top (including ASP.NET). Other languages are really just abstractions on top of assembly. In fact, ASP.NET gets compiled into IL (Intermediate Language) code, which then gets JIT-compiled into native machine code. Assembly is as close to the metal as you're going to get.
To be pedantic, "assembler" is not actually a language (any more than "compiler" is;-) -- rather, it's a program that takes a source file in "assembly language" and emits binary machine code. The binary machine code can be said to be lower-level than the assembly language, since the latter allows use of some symbols and often includes a macro processing ability as well.
"Below" binary machine code, there may be other levels, known as "microcode" (but there might not be -- the CPU might be implemented entirely in real hardware, without any microprogramming aspect). That might be relevant only if the system's architecture allowed programmers to alter the microcode, especially by adding to it, etc -- there have been machines that did that, but I don't believe any currently commercialized CPU does. So you probably don't have to care about that (and the by-now-esoteric distinctions between vertical and horizontal microcode, etc, etc;-).
Programming languages are just ways to assemble solutions to computing problems.
The argument is "assembled out of what?"
From that point of view, I'd suggest the following evolutionary curve:
Napier's Bones
Babbage's difference engine
Jacquard (card) looms
(Conceptual) Abstract Turing machines/Post Systems/Church's calculus
Relay Computers (Aiken?)
Vacuum tubes as switching elements (Eniac)
Transistor-based computers
Microprogrammed machines
Integrated Circuits
Large Scale Circuits
with "assembler" being the programming language used to
put together solutions consisting of instructions for
real machines starting with the vacuum tube systems.
(I'm not sure the relay machines actually had assemblers).
Programming langauges are just ways to put together high
level commands that reduce in effect to assembler instructions.
There are two different dimensions to consider here, what I'd call vertical growth (languages build up over time from one generation to the next) and horizontal growth (syntactic improvements and reduction in complexity.)
A good explanation of vertical change is seen here: http://web.sxu.edu/rogers/sys/generations.html
And a nice, yet incomplete, illustration of horizontal change is here: http://oreilly.com/news/graphics/prog_lang_poster.pdf

Is a functional language a good choice for a Flight Simulator? How about Lisp?

I have been doing object-oriented programming for a few years now, and I have not done much functional programming. I have an interest in flight simulators, and am curious about the functional programming aspect of Lisp. Flight simulators or any other real world simulator makes sense to me in an object-oriented paradigm.
Here are my questions:
Is object oriented the best way to represent a real world simulation domain?
I know that Common Lisp has CLOS (OO for lisp), but my question is really about writing a flight simulator in a functional language. So if you were going to write it in Lisp, would you choose to use CLOS or write it in a functional manner?
Does anyone have any thoughts on coding a flight simulator in lisp or any functional language?
UPDATE 11/8/12 - A similar SO question for those interested -> How does functional programming apply to simulations?
It's a common mistake to think of "Lisp" as a functional language. Really it is best thought of as a family of languages, probably, but these days when people say Lisp they usually mean Common Lisp.
Common Lisp allows functional programming, but it isn't a functional language per se. Rather, it is a general-purpose language. Scheme is a much smaller variant that is more functional in orientation, and of course there are others.
As for your question, is it a good choice? That really depends on your plans. Common Lisp in particular has some real strengths for this sort of thing. It's both interactive and introspective at a level you usually see in so-called scripting languages, making it very quick to develop in. At the same time it's compiled and has efficient compilers, so you can expect performance in the same ballpark as other compiled languages (within a factor of two of C is typical, in my experience). While a large language, it has a much more consistent design than things like C++, and the metaprogramming capabilities can make for very clean, easy-to-understand code for your particular application. If you only look at these aspects, Common Lisp looks amazing.
However, there are downsides. The community is small, so you won't find many people to help if that's what you're looking for. While the built-in library is large, you won't find as many third-party libraries, so you may end up writing more of it from scratch. Finally, while it's by no means a walled garden, CL doesn't have the kind of smooth integration with foreign libraries that, say, Python does. That doesn't mean you can't call C code; there are nice tools for this.
By the way, CLOS is about the most powerful OO system I can think of, but it is quite a different approach. If you're coming from a mainstream C++/Java/C#/etc. OO background (yes, they differ, but beyond single vs. multiple inheritance, not that much), you may find it a bit strange at first, almost turned inside out.
If you go this route, you are going to have to watch for some issues with the performance of the actual rendering pipeline, if you write that yourself with CLOS. The class system has incredible runtime flexibility (i.e. updating class definitions at runtime, not via monkey patching etc. but by actually changing the class and updating instances); however, you pay some dispatch cost for this.
For what it's worth, I've used CL in the past for research code requiring numerical efficiency, i.e. simulations of a different sort. It works well for me. In that case I wasn't worried about using existing code -- it didn't exist, so I was writing pretty much everything from scratch anyway.
In summary, it could be a fine choice of language for this project, but not the only one. If you don't use a language with both high-level aspects and good performance (like CL has, as do OCaml and a few others), I would definitely look at the possibility of a two-level approach with a language like Lua or perhaps Python (lots of libs) on top of some C or C++ code doing the heavy lifting.
If you look at the game or simulator industry you find a lot of C++ plus maybe some added scripting component. There can also be tools written in other languages for scenery design or related tasks. But there is only very little Lisp used in that domain. You need to be a good hacker to get the necessary performance out of Lisp and to be able to access or write the low-level code. How do you get this knowhow? Try, fail, learn, try, fail less, learn, ... There is nothing but writing code and experimenting with it. Lisp is really useful for good software engineers or those that have the potential to be a good software engineer.
One of the main obstacles is the garbage collector. Either you have a very simple one (then you have a performance problem with random pauses) or you have a sophisticated one (then you have a problem getting it working right). Only a few garbage collectors exist that would be suitable - most Lisp implementations have good GC implementations, but they are still not tuned for real-time or near-real-time use. Exceptions do exist. With C++ you can forget the GC, because there usually is none.
The other alternative to automatic memory management with a garbage collector is to use no GC and manage memory 'manually'. This is used by some (even commercial) Lisp applications that need to support some real-time response (for example process control expert systems).
The nearest thing that was developed in that area was Crash Bandicoot (and the later games) for the PlayStation 1 (the later games were for the PlayStation 2) from Naughty Dog. Since being bought by Sony, they have switched to C++ for the PlayStation 3. Their development environment was written in Allegro Common Lisp and included a compiler for a variant of Scheme (a Lisp dialect). During development the code was compiled on the development system and then downloaded to the PlayStation. They had their own 3D engine (very impressive, it always got excellent reviews from game magazines), incremental level loading, complex behaviour control for lots of different actors, etc. So the PlayStation was really executing the Scheme code, but memory management was not done via GC (afaik). They had to develop all the technology on their own - nobody was offering Lisp-based tools - but they could, because they were excellent software developers. Since then I haven't heard of a similar project. Note that this was not just Lisp for scripting - it was Lisp all the way down.
On the Scheme side there is also a new interesting implementation called Ypsilon Scheme. It is being developed for a pinball game - this could be the base for other games, too.
On the Common Lisp side, there have been Lisp applications talking to flight simulators and controlling aspects of them. There are some game libraries that are based on SDL. There are interfaces to OpenGL. There is also something like the 'Open Agent Engine'. There are also some 3d graphics applications written in Common Lisp - even some complex ones. But in the area of flight simulation there is very little prior art.
On the topic of CLOS vs. Functional Programming. Probably one would use neither. If you need to squeeze all possible performance out of a system, then CLOS already has some overheads that one might want to avoid.
Take a look at Functional Reactive Programming. There are a number of frameworks for this in Haskell (don't know about other languages), most of which are based around arrows. The basic idea is to represent relationships between time-varying values and events. So for example you would write (in Haskell arrow notation using no particular library):
velocity <- {some expression of airspeed, heading, gravity etc.}
position <- integrate <- velocity
The second line declares the relationship between position and velocity. The <- arrow operators are syntactic sugar for a bunch of library calls that tie everything together.
Then later on you might say something like:
groundLevel <- getGroundLevel <- position
altitude <- getAltitude <- position
crashed <- liftA2 (<) altitude groundLevel
to declare that if your altitude is less than the ground level at your position then you have crashed. Just as with the other variables here, "crashed" is not just a single value, it's a time-varying stream of values. That is why the "liftA2" function is used to "lift" the comparison operator from simple values to streams.
IO is not a problem in this paradigm. Inputs are time varying values such as joystick X and Y, while the image on the screen is simply another time varying value. At the very top level your entire simulator is an arrow from the inputs to the outputs. Then you call a "run" function that converts the arrow into an IO action that runs the game.
If you write this in Lisp you will probably find yourself creating a bunch of macros that basically re-invent arrows, so it might be worth just finding out about arrows to start with.
I don't know anything about flight sims, and you haven't listed anything in particular they consist of, so this is mostly a guess about writing a FS in Lisp.
Why not:
Lisp excels at exploratory programming. I think that since FSs have been around so long, and there are free and open-source examples, it would not benefit as much from this type of programming.
Flight sims are mostly (I'm guessing) written in static, natively compiled languages. If you're looking for pure runtime performance, in Lisp this tends to mean type declarations and other not-so-Lispy constructs. If you don't get the performance you want with naive approaches, your optimized Lisp might end up looking a lot like C, and Lisp isn't as good as C at writing C.
A lot of a FS, I'm guessing, is interfacing to a graphics library like OpenGL, which is written in C. Depending on how your FFI / OpenGL bindings are, this might, again, make your code look like C-in-Lisp. You might not have the big win that Lisp does in, say, a web app (which consists of generating a tree structure of plain text, which Lisp is great at).
Why:
I took a glance at the FlightGear source code, and I see a lot of structural boilerplate -- even a straight port might end up being half the size.
They use strings for keys all over the place (C++ doesn't have symbols). They use XML for semi-human-readable config files (C++ doesn't have a runtime reader). Simply switching to native Lisp constructs here could be big win for minimal effort.
Nothing looks at all complex, even the "AI". It's simply a matter of keeping everything organized, and Lisp will be great at this because it'll be a lot shorter.
But the neat thing about Lisp is that it's multi-paradigm. You can use OO for organizing the "objects", and FP for computation within each object. I say just start writing and see where it takes you.
I would first think of the nature of the simulation.
Some simulations require interaction, like a flight simulator. I don't think functional programming is a good choice for an interactive (read: CPU-intensive/response-critical) application. Of course, if you have access to 8 PS3's wired together with Linux, you'll not care too much about performance.
For simulations like evolutionary/genetic programming where you set it up and let 'er rip, a functional language may help model the problem domain better than an OO language. Not that I'm an expert in functional programming, but the ease of coding recursion and the idea of lazy evaluation common in functional languages seem to me a good fit for the 'let her rip' sort of sims.
I wouldn't say functional programming lends itself particularly well to flight simulation. In general, functional languages can be very useful for writing scientific simulations, though this is a slightly specialised case. Really, you'd probably be better off with a standard imperative (preferably OOP) language like C++/C#/Java, as they would tend to have the better physics libraries as well as graphics APIs, both of which you would need to use very heavily. Also, the OOP approach might make it easier to represent your environment. Another point to consider is that (as far as I know) the popular flight simulators on the market today are written pretty much entirely in C++.
Essentially, my philosophy is that if there's no particularly good reason that you should need to use functional paradigms, then don't use a functional language (though there's nothing to stop you using functional constructs in OOP/mixed languages). I suspect you're going to have a much less painful development process using the well-tested APIs for C++ and the languages more commonly associated with game development (which has many commonalities with flight sim). Now, if you want to add some complex AI to the simulator, Lisp might seem like a rather more obvious choice, though even then I wouldn't jump for it. And finally, if you're really keen on using a functional language, I would recommend you go with one of the more general-purpose ones like Python or even F# (both mixed imperative-functional languages, really), as opposed to Lisp, which could end up getting rather ugly for such a project.
There is one notable problem with functional languages: they don't mesh well with state, but they do go well with process. So in a way it could be said they are action oriented. This means you'd be wasting your time simulating a plane; what you want to do is simulate the actions of flying a plane. Once you grok that, you can probably get it to work.
Now, as a side point, Haskell wouldn't be good IMHO, because it's too abstract for a "game"; this sort of app is all about input/output, but Haskell is about avoiding IO, so it'll become a monad nightmare and you'll be working against the language. Lisp is a better choice, or Lua or JavaScript (they are also functional, but not purely functional), so for your case try Lisp. Anyway, in any of these languages your graphics will be C or C++.
A serious issue, however, is that there is very little documentation, and even fewer tutorials, about functional languages and "games". Scientific simulation is academically documented, of course, but those papers are quite dense. If you succeed, maybe you could write up your experiences for others, as it's a rather empty field right now.

What are the practical advantages of learning Assembly? [closed]

Most people suggest that learning assembly is essential, that it's important to know the underlying workings of the computer, and so forth. But what I'm looking for are some practical suggestions that will make the effort of learning Assembly worth it.
What are your suggestions? What am I missing out on by not learning Assembly and pointers/memory management in general?
I think the main practical advantage to learning low-level things like assembly language, pointers, and memory management is that when you're writing or reviewing high-level code you're better able to instinctively or subconsciously spot performance issues or other pitfalls.
An average developer might write a simple loop and think, "This code iterates over a set of integers and writes each to the console."
An expert developer might write the same loop and think, "This code iterates over a set of integers, and has to box each element to call the ToString method and ToString has to format the string in base 10 which is somewhat non-trivial, and then both the boxed integer and the formatted string will soon be eligible for garbage collection as no references will remain, and the first time this method runs, it will need to be JIT'ed..." and so on.
9 times out of 10, it may not matter. But that 1 time out of 10, the expert developer is likely to notice a problem in code that the average developer would never think to consider.
Pointers/memory management are more general than assembly language. You need to understand them for C and C++ as well, which you might need if you have to maintain code written in C.
For assembly language, it is sometimes useful to read the assembler code that the C compiler generates, to find out whether it generates correct and efficient code.
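For example, with GCC or Clang you can dump the generated assembly for a translation unit and check what a hot function actually compiles to (the file and function below are just stand-ins for the example):

/* sum.c -- compile with:  cc -O2 -S sum.c   (writes the assembly to sum.s),
   or disassemble an already-built object file with:  objdump -d sum.o */
int sum(const int *a, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        total += a[i];       /* look in sum.s: did the compiler vectorize this? */
    return total;
}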
You need to learn to read assembly so you can figure out what goes wrong when a complex statement bombs out. The CPU debug window shouldn't be a mysterious place.
This is sort of one of those questions that will always be asked: "Why should I know anything?" etc. Well, perhaps you could get a job doing something besides building the next generic CRUD application or something like that. If you want to do any sort of systems development, having a working knowledge of assembly is very helpful, if not vital. As far as what you're "missing out" on: perhaps you are missing out on actually knowing how computers work. Some people think this is desirable. Some people don't. Some people build processors. Some people dig ditches. It's all a matter of personal preference :)
I think it's great to learn new languages. It opens my mind. Some languages are more mind-opening than others. I'd say assembler is one of those. It forces you to think about stuff like the call stack and instruction pointer. And it'll make you appreciate higher level languages even more. Another fun language to learn is PostScript.
I don't think you need to learn assembly for anything practical. However, it will ensure that you understand the real roots of what you are doing as a developer. In essence, assembly programming is a discipline for learning chip logic and architecture. I haven't programmed assembly in over two decades but it still informs the kinds of choices I make when programming C#.
But what I'm looking for are some practical suggestions that will make the effort of learning Assembly to be worth it.
Learn what assembly is.
Really learn how to read (and understand) small fragments of it: how to walk/step through it in your mind.
Perhaps too, step through some of it with a debugger (including seeing memory and registers being changed).
Ideally, find some annotated assembly.
But, don't bother to learn how to write assembly: instead, learning to write C or C++ is probably 'low' enough for most practical purposes.
Well, on a practical level, I did a class in 6502 assembler when I was first learning to code in the early 80s. I also did some 8088 assembler. It's been of occasional use over the years since, but I can't say it's ever really got me out of a hole on more than one or two occasions in 25 years. Grokking C at a pretty fundamental level is of far more use. YMMV, and it's certainly helpful as background, but as a direct practical benefit? Marginal, really.
Perversely though one thing that has proved useful is at an even lower level. I did a class on chip design (NAND gates and the like) and as part of that was taught formal Boolean logic at some depth. That's been massively useful ever since - it's surprising the number of coders who don't really know what they are doing with ands, ors and nots :-)
Pointers and memory management are really a different question than assembly. If you want to do C/C++, then you need to learn pointers and memory management, because those are part of the language. But, even if you plan to use nothing but (say) Java all your life, you should learn something about memory management to keep from writing a memory leak despite the GC, and pointers are just the difference between atomic types and object references. You need the concepts or you'll write programs that don't work!
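A minimal C sketch of the kind of thing those concepts buy you - the difference between holding a reference and owning the memory behind it (the function names are made up for the example):

#include <stdlib.h>
#include <string.h>

char *copy_string(const char *s) {
    char *p = malloc(strlen(s) + 1);   /* we now own this block           */
    if (p) strcpy(p, s);
    return p;                          /* ownership passes to the caller  */
}

void demo(void) {
    char *name = copy_string("anna");
    /* 'name' is a pointer - roughly an object reference in GC-language terms.
       Forgetting the next line leaks the block, and here no GC will save us. */
    free(name);
}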
Practical reasons for learning assembly: debugging and optimization. Even if you don't write any assembly, one of these days you may need to optimize C/C++ code for performance. In that case, you'll need to be able to read the assembly for your inner loop, even if you never need to write another line of it.
Ultimately, I think your distinction between "knowing the underlying workings of your computer" and "practical suggestions that will make the effort of learning assembly worth it" is a false one. Ignorance does not pay. Learning how your computer works is a practical suggestion worth the effort!
I have a prophecy: someday soon, your program will run far too slowly to be practical, and crash intermittently with an out-of-memory exception. On that day, the sheer screaming anxiety of not knowing what the hell is going on or where to start looking in order to fix it will refund your karma debt, with interest...
These days many assembly languages are actually fairly high level.
And it's always been true that if you learn 'C', that's close enough to assembly to get most of the learning benefits.
edit: thinking about this a bit more, in Knuth's books he describes an idealised assembly language. You won't go far wrong learning that, and reading those books.
Another practical reason I can think of is reverse engineering application code to modify it, for educational purposes ONLY, since this is widely used by crackers to bypass shareware application protections like time limits or serial numbers.
An application like win32Dasm can convert executables into assembly code that can later be modified with a Hex editor like hiew. You can learn quite a lot about the flow of the program.
I think learning about computer architecture, in conjunction of assembly, would open your mind quite a bit.
It would help explain lots of performance issues - e.g. a parser is slow because there are lots of branches, the pipeline gets flushed very easily, and the branch predictor cannot compensate for everything.
Also, different architectures have their quirks. Someone talked about an assembly trick to swap 2 registers in place, involving xor's. It works, and it would run great for in-order execution core (most recent example would be the Intel Atom, and the Via C7 in netbooks), but not so great in out-of-order cores.
Knowing that may help you detect poorly compiled code by inspecting it in assembly, and possibly to write code in a higher-level language that sidesteps the imperfections of compiler optimizers. I'm not trying to diss them, but they just can't be perfectly tuned.
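For reference, the swap trick mentioned above looks like this when written on C variables; the same three xors work on two registers, at the cost of a serial dependency chain that an out-of-order core cannot hide:

void xor_swap(unsigned *a, unsigned *b) {
    /* Swaps *a and *b without a temporary. Each xor depends on the result of
       the previous one, so the three operations cannot overlap - which is why
       a plain temp-variable swap is usually at least as fast on modern CPUs. */
    if (a == b) return;     /* xor-swapping a value with itself would zero it */
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}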
The biggest practical advantage of learning Assembly is performance. You can optimize to near perfection when it's required.

Resources