Can I execute untrusted Common Lisp code in a restricted environment? - common-lisp

Suppose I wanted to take advantage of Common Lisp's ability to read and execute Common Lisp code so that my program can execute external code written in Lisp, but I don't trust that code, so I don't want it to have access to the full power of Common Lisp. Is it possible for me to restrict its environment so that it can only see the packages/symbols to which I explicitly give it access, effectively creating a DSL?

To read the code, start by disabling *read-eval* (that stops people injecting execution during parsing, using something like #.(do-evil-stuff)). You probably want to do the reading with a custom readtable that disables most (if not all) reader macros, and with a custom, one-off package that imports only the symbols you allow.
Once you've read the user-provided code, you still need to validate that there are no unexpected function/macro references in it. If you have used a custom package, you should be able to confirm that each symbol falls into one of two classes: "belongs to the custom one-off package" (this is user-supplied stuff) or "explicitly allowed from elsewhere" (you would need this list to construct the custom package anyway).
Once that's been done, you can then evaluate it.
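A rough sketch of those steps might look like the following (the names safe-read-and-eval and *allowed-symbols* are just illustrative; a real sandbox would also strip reader macros from the readtable copy and treat the whitelist far more carefully):

(defparameter *allowed-symbols*
  '(nil t + - * / let if lambda quote list car cdr cons)
  "Symbols the untrusted code is explicitly allowed to use.")

(defun safe-read-and-eval (string)
  (let ((sandbox (make-package (gensym "SANDBOX") :use '())))
    (import *allowed-symbols* sandbox)              ; only these are visible by name
    (unwind-protect
         (let* ((*read-eval* nil)                   ; no #.(...) at read time
                (*readtable* (copy-readtable nil))  ; fresh copy of the standard readtable
                (*package* sandbox)                 ; unqualified new symbols land here
                (form (read-from-string string)))
           ;; Validate: every symbol is either user-local or explicitly allowed.
           (labels ((check (x)
                      (cond ((consp x) (check (car x)) (check (cdr x)))
                            ((symbolp x)
                             (unless (or (eq (symbol-package x) sandbox)
                                         (member x *allowed-symbols*))
                               (error "Symbol ~S is not allowed" x))))))
             (check form))
           (eval form))
      (delete-package sandbox))))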
However, doing this correctly would take a fair bit of care and you really should have someone else have a look at the code and actively try to break out of the sandbox.

Take a look at the section 'Reader security' in chapter 4 of Let Over Lambda, which discusses this topic in some depth. In particular, you probably want to set *read-eval* to nil. As for restricting access to the environment, this is generally difficult in Common Lisp, as the language is designed to allow access to most pieces of the system in the first place. Maybe you can elaborate the ideas of Let Over Lambda in the direction of whitelisting symbols (as opposed to the blacklisting of macro characters described in that chapter). I don't think there are any ready-made solutions.
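As a small illustration of the macro-character blacklisting described in that chapter (restricted-readtable is just an illustrative name, and the list of sub-characters is not exhaustive), on top of which a symbol whitelist like the one sketched in the previous answer could be layered:

(defun restricted-readtable ()
  (let ((rt (copy-readtable nil)))   ; start from a copy of the standard readtable
    ;; Disable the sharp-sign syntaxes we don't want untrusted input to use.
    (dolist (sub-char '(#\. #\S #\A #\= #\#))
      (set-dispatch-macro-character
       #\# sub-char
       (lambda (stream disabled-char arg)
         (declare (ignore stream arg))
         (error "Disallowed #~A syntax in untrusted input" disabled-char))
       rt))
    rt))

(let ((*readtable* (restricted-readtable)))
  (read-from-string "#.(do-evil-stuff)"))  ; => error, even before *read-eval* matters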

Related

What draws the line between reflective programming & non-reflective programming like simple softcoding?

Not sure if this is the right place to bring up this kind of discussion, but I'm reading https://en.wikipedia.org/wiki/Reflective_programming and feel I need a bit more clarification on where the line between "reflective" and non-reflective programming really goes. There's a series of examples of reflective vs. non-reflective code towards the end of the Wikipedia page, where all the "reflective" examples seem to access data with string identifiers - but what would actually differentiate this from, say, putting a bunch of objects in a collection/array of some sort and accessing them by an index, compared to accessing them via an array of string identifiers that you can use to fetch the desired object?
In some languages you can clearly see the difference and the benefit: Python and JS, for example, have eval, which lets you insert all sorts of code at runtime that can be pretty much endlessly complex and completely change the code flow of an application - no longer limited to accessing special, pre-declared objects. But on the wiki page you can also find examples where the "reflection" seems limited to accessing specially declared objects by their name (at which point I'm questioning whether you can really argue that the program is "modifying" itself at all, at least from a high-level, conceptual point of view).
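For example (a made-up sketch in Lisp, just to make the comparison concrete, with illustrative variable names):

;; Access by position: the name never exists as run-time data.
(defparameter *things* (vector 'alpha 'beta 'gamma))
(aref *things* 2)                                             ; => GAMMA

;; Name-based lookup: the key is a string built at run time and resolved
;; through the language's own symbol machinery.
(defparameter *gamma-value* 42)
(symbol-value (find-symbol (string-upcase "*gamma-value*")))  ; => 42

;; Full-blown eval, like Python/JS: arbitrary code constructed and run
;; at run time.
(eval (read-from-string "(+ 1 2)"))                           ; => 3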
Does the way the underlying machinery is produced by the compiler (or the way the interpreter reads your code) affect what's considered to be reflective?
Is it the ability to redefine the contents of existing objects, or to declare new objects without a "base class"/preexisting structure created at compile time, that differentiates reflective from non-reflective code? If so, how does this square with the examples on the Wikipedia page, which don't seem to showcase this ability?
Can the meaning of "reflective programming" vary slightly depending on the scenario?
Any thoughts appreciated <3

Common Lisp reader: customizing intern behavior

I would like to intercept the behavior of read to give some control over the interning of symbols. I might, for example, wish for read to throw an error if a previously uninterned symbol shows up in the input stream. Or perhaps I want to limit the packages in which new symbols can be interned.
Is there a way to hook the interning process without rewriting the reader from scratch?
I am ok with alternate reader implementations. Using read itself is not a must.
You can't do this with the reader defined by the standard without jumping through huge hoops: you'd have to implement the process of accumulating and parsing tokens (including all the number parsing stuff) and then provide suitable ways of intervening. The standard tells you enough that you should be able to do that, but it's a lot of work: I suspect that most of any reader implementation is that stuff.
Of course specific implementations might provide convenient points at which you can intervene.
The other approach would be to use a portable, extensible reader. There is at least one project which may be such a thing: Eclector, and there may well be others. I don't know anything about it, unfortunately.
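As a rough, portable approximation that stops short of actually hooking intern (the function name and the scratch-package trick are just an illustration): read with *package* bound to a scratch package and reject anything that got newly interned there. Package-qualified input such as foo::bar would still intern elsewhere, so this is only a partial answer.

(defun read-rejecting-new-symbols (string &key (allowed '("COMMON-LISP")))
  (let ((scratch (make-package (gensym "SCRATCH")
                               :use (mapcar #'find-package allowed))))
    (unwind-protect
         (let* ((*package* scratch)
                (form (read-from-string string)))
           ;; Any symbol whose home package is the scratch package was
           ;; interned by this very read, i.e. it did not exist before.
           (do-symbols (s scratch)
             (when (eq (symbol-package s) scratch)
               (error "Previously uninterned symbol ~A in input" s)))
           form)
      (delete-package scratch))))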

Why do we need to compile Progress 4GL programs?

I would like to know why we need to compile a Progress 4GL program. What is really happening behind the scenes? Why do we get a .r file after compiling the program? When we check the syntax and it is correct, we get a message box saying 'Syntax is correct' - how does it find the errors and show the messages? Any explanations welcome and appreciated.
Benefits of compiled r-code include:
Syntax checking
Faster execution (r-code does not have to be compiled on the fly at run time)
Security (r-code is not "human readable" and tampering with it will likely be noticed)
Licensing (r-code runtime licenses are much less expensive)
For "how its finding the errors and showing the messages" -- at a high level it is like any compiler. It evaluates the provided source against a syntax tree and lets you know when you violate the rules. Compiler design and construction is a fairly advanced topic that probably isn't going to fit into a simple SO question -- but if you had something more specific that could stand on its own as a question someone might be able to help.
The short answer is that when you compile, you're translating your program to a lower-level form (r-code) that the Progress runtime can execute directly. You're asking two different questions here, so let me give you a simple answer to the first: you don't NEED to compile if you're the only one using the program, for example. But in order to have your program run efficiently (it's already translated, so it doesn't have to be parsed and compiled again at run time) and to guarantee no one is messing with your logic, we compile the code and usually don't allow regular users to access the source code.
As for the second question - how does the syntax checker work - I believe it would be better for you to Google and choose some articles to read about compilers. They're complex, but in a nutshell what they do is take what Progress expects as full, operational statements and compare that to what you wrote. For example, if you write
Find first customer where customer.active = yes no-error.
Progress will check whether customer is a table, whether customer.active is a field in that table, and whether it's of the logical type (since you are testing whether it is yes), and whether your whole condition can be reduced to a single true or false Boolean value. It goes on to check whether you specified a lock (and defaults to shared if you haven't, like in my example, which is a no-no, by the way), what happens if there are multiple records (since I said first, get just the first one) and finally what happens if it fails. If you check the find statement, there are more options to customize it, and the compiler will simply compare your use of the statement to what Progress allows, and collect all the errors if it can't. That's why compilers sometimes give you generic messages: since they don't know what you're trying to do, all they can do is tell you what's basically wrong with what you wrote.
Hope this helps you understand.

No Global Contract available for procedure / function

I've got a procedure within a SPARK module that calls the standard Ada.Text_IO.Put_Line.
During proving I get the following warning: warning: no Global contract available for "Put_Line".
I already know how to add the respective data dependency contract to procedures and functions written by myself, but how do I add them to procedures/functions written by others, where I can't edit the source files?
I looked through sections 5.2 and 7.4 of the AdaCore SPARK 2014 user's guide but didn't find an example with a solution to my problem.
This means that the analyzer cannot "see" whether global variables might be affected when this function is called. It therefore assumes this call is not modifying anything (otherwise all other proofs could be refuted immediately). This is likely a valid assumption for your specific example, but it might not be valid on an embedded system, where a custom implementation of Put_Line might do anything.
There are two ways to convey the missing information:
The verifier can examine the source code of the function; then it can try to generate Global contracts itself.
Global contracts are specified explicitly; see SPARK RM 6.1.4 (http://docs.adacore.com/spark2014-docs/html/lrm/subprograms.html#global-aspects)
In this case, the procedure you are calling is part of the run-time system (RTS), and therefore the source is not visible, and you probably cannot/should not change it.
What to do in practice?
Suppressing warnings is almost never a good idea, especially not when you are working on something safety-critical. Usually the code has to be changed until the warning goes away, or some justification process has to start.
If you are serious about the analysis results, I recommend to not use such subprograms. If you really need output there, either write your own procedure that replaces the RTS subprogram, or ensure that the subprogram really has no side effects. This is further backed up by what Frédéric has linked: Even if the callee has no side effects, you don't know whether it raises an exception for specific inputs (e.g., very long strings).
If you are not so serious about the results, then you can consider this specific one as a warning that you could live with.
Wrapper packages for use in development of SPARK applications may be found here:
https://github.com/joakim-strandberg/aida_2012
I think you just can't add SPARK contracts to code you don't own, especially code from the Ada standard library.
About Text_IO, I found something that may be valuable to you in the reference manual.
EDIT
Another solution, besides the one Martin described, is to create a wrapper package, according to the book "Building High Integrity Applications with SPARK".
As SPARK requires you to deal with SPARK packages but allows you to depend on a SPARK spec with an Ada body, the solution is to build a SPARK package wrapping your Ada.Text_IO calls.
It might be tedious, as you will have to wrap possible exceptions, possibly define specific types and so on, but this way you'll be able to discharge VCs on your full SPARK package.

Ada: pragma Pure / Remote_Types and system types

I'm writing an Ada application that needs to be distributed, and I'm trying to use the DSA to do it, but I'm finding big limitations in what is "allowed" to be "withed" and what isn't.
I won't post source code, since it's quite complex and this is a generic question anyway; I just wanted some pointers on what I'm not understanding correctly, so please bear with me and correct me if I'm wrong.
So my problem is this: I want to mark a procedure with the pragma Remote_Call_Interface so it can be called remotely. However as soon as I add the pragma compilation breaks due to the fact that the procedure is including other packages in my project that are not categorized as either Pure or Remote_Types.
So I try to mark the packages I need as either Pure or Remote_Types (depending on whether they have state or not), but this in turn breaks compilation even further, since it turns out that you can't use even basic system types in a Pure/Remote_Types package. For example: you can't use Vectors, you can't use Unbounded_Strings, you can't use Maps, etc... the whole program falls to pieces since I can't use the data structures I used to build it anymore!
Is there a way around this? Or if I want to distribute my application I must strictly limit myself to the most basic types like Integers and booleans and little else?? I don't understand if I'm hitting against a limitation of the language or if I'm just doing it incorrectly (unfortunately the tutorials I found on DSA are all very vague, incidentally if anyone has some good ones feel free to link them!)
EDIT: after ajb's answer, let me specify what is annoying me in particular: in the package I want to mark with pragma Remote_Call_Interface, I'm trying to "with" some packages that are not Pure/Remote_Types; however, it only uses the types from those packages locally - it does not contain any procedures that accept such types as parameters, nor functions that return such types. This is what bothers me: since those types would never have to "travel" over the network, why can't I with them? I'm only using them locally... I don't understand this, and that is why I was trying to make those types Pure/Remote_Types, but now that I've read ajb's explanation (i.e., Remote_Types is used so that objects of those types can travel over the network) I'm even more confused about why I can't use them if I only use them locally.
I'm not an expert on Ada distributed programming, but here's what I do know (or think I know):
The Annotated Ada Reference Manual, Section E.2.3 says, "The restrictions governing a remote call interface library unit are intended to ensure that the values of the actual parameters in a remote call can be meaningfully sent between two active partitions." For example, if a record type has a field that's an access type, you can't send it from one partition to another blindly, because the called partition won't be able to access the memory that the pointer points to. (Unbounded_String, Map, and Vector are implemented using access types as part of the internals.) All types used as parameters or return types must support "external streaming", meaning there has to be a way for the type to be converted to and from a stream of bytes so that the parameter value can be transmitted over a socket. If you have a record with an access type, but you provide 'Read and 'Write attributes so that the type can be written to and read from a byte stream without any actual pointers being transmitted, then you can put your record type in a Remote_Types package.
I'm not sure exactly what your problem is: are there certain types you want to pass as a parameter to a remote call but can't; or are there types that you want to use only in the rest of your application, but are getting in the way?
If it's the second one, then I think the solution is to restructure your packages so that all the "remote types" are separate from the non-remote types.
However, if you're really looking to pass an Unbounded_String, Map, or Vector from one partition to another in a remote call, it's trickier. Unbounded_String really should support external streaming, and there was a proposal to make Unbounded_String a Remote_Types package (see AI05-0204), but it wasn't acted on--I don't know why. Map and Vector would be bigger problems, though, since they are generic packages that have to work on any type, including those that don't support external streaming. In any case, those types aren't set up to be automatically converted to or from bytes to be passed over a socket.
But I think you could make it work like this:
private with Ada.Strings.Unbounded;
package Remote_Types_Package is
   pragma Remote_Types;
   type My_Unbounded_String is private;
private
   type My_Unbounded_String is record
      S : Ada.Strings.Unbounded.Unbounded_String;
   end record;
end Remote_Types_Package;
The Unbounded_String package must be withed with private with; see E.2.2(6). You'll need to provide a function to create the My_Unbounded_String, and you'll need to provide stream read and write routines for My_Unbounded_String, and define 'Read and 'Write for the type. You should be able to write the Read and Write attributes by using the Read and Write attributes for the Unbounded_String. Something similar should be doable if you want to use a Vector as a remote call parameter, although you may have to do more work to marshal/unmarshal the type yourself.
Once again, I have not tried this, and it's possible there are some hitches in this solution.
EDIT: Since it now looks like the question is the simpler one--i.e. you have some types that are not going to be passed between partitions getting in the way--the solution should be simpler. Any types that you define that are going to be communicated between partitions need to be in a Remote_Types package, say P1. Other types should be in a different package, say P2 (or multiple packages). If types in P1 depend on types in P2, you can still get this to work by having P1 say private with P2;, and making sure you have the marshalling and unmarshalling procedures you need. If you run into difficulties, I'd encourage you to ask a new question here.
I don't know why the language required all such types to be quarantined in a Remote_Types package, instead of just saying that any type used in a Remote_Call_Interface package has to have only parts that can be streamed. There may have been some implementation issues. Any code that exists for a Remote_Types package has to be in programs in both partitions, perhaps, and this may have been an attempt to limit the type of code that would have to be linked into multiple partitions. But I'm just guessing.
