How to systematically populate a whitelist for a sandboxing program? - reflection

On pp. 260-263 of Programming in Lua (4th ed.), the author discusses how to implement "sandboxing" (i.e. the running of untrusted code) in Lua.
When it comes to imposing limiting the functions that untrusted code can run, he recommends a "whitelist approach":
We should never think in terms of what functions to remove, but what functions to add.
This question is about tools and techniques for putting this suggestion into practice. (I expect there will be confusion on this point I want to emphasize it upfront.)
The author gives the following code as an illustration of a sandbox program based on a whitelist of allowed functions. (I have added or moved around some comments, and removed some blank lines, but I've copied the executable content verbatim from the book).
-- From p. 263 of *Programming in Lua* (4th ed.)
-- Listing 25.6. Using hooks to bar calls to unauthorized functions
local debug = require "debug"
local steplimit = 1000 -- maximum "steps" that can be performed
local count = 0 -- counter for steps
local validfunc = { -- set of authorized functions
[string.upper] = true,
[string.lower] = true,
... -- other authorized functions
}
local function hook (event)
if event == "call" then
local info = debug.getinfo(2, "fn")
if not validfunc[info.func] then
error("calling bad function: " .. (info.name or "?"))
end
end
count = count + 1
if count > steplimit then
error("script uses too much CPU")
end
end
local f = assert(loadfile(arg[1], "t", {})) -- load chunk
debug.sethook(hook, "", 100) -- set hook
f() -- run chunk
Right off the bat I am puzzled by this code, since the hook tests for event type (if event == "call" then...), and yet, when the hook is set, only count events are requested (debug.sethook(hook, "", 100)). Therefore, the whole song-and-dance with validfunc is for naught.
Maybe it is a typo. So I tried experimenting with this code, but I found it very difficult to put the whitelist technique in practice. The example below is a very simplified illustration of the type of problems I ran into.
First, here is a slightly modified version of the author's code.
#!/usr/bin/env lua5.3
-- Filename: sandbox
-- ----------------------------------------------------------------------------
local debug = require "debug"
local steplimit = 1000 -- maximum "steps" that can be performed
local count = 0 -- counter for steps
local validfunc = { -- set of authorized functions
[string.upper] = true,
[string.lower] = true,
[io.stdout.write] = true,
-- ... -- other authorized functions
}
local function hook (event)
if event == "call" then
local info = debug.getinfo(2, "fnS")
if not validfunc[info.func] then
error(string.format("calling bad function (%s:%d): %s",
info.short_src, info.linedefined, (info.name or "?")))
end
end
count = count + 1
if count > steplimit then
error("script uses too much CPU")
end
end
local f = assert(loadfile(arg[1], "t", {})) -- load chunk
validfunc[f] = true
debug.sethook(hook, "c", 100) -- set hook
f() -- run chunk
The most significant differences in the second snippet relative to the first one are:
the call to debug.sethook has "c" as mask;
the f function for the loaded chunk gets added to the validfunc whitelist;
io.stdout.write is added to the validfunc whitelist;
When I use this sandbox program to run the one-line script shown below:
# Filename: helloworld.lua
io.stdout:write("Hello, World!\n")
...I get the following error:
% ./sandbox helloworld.lua
lua5.3: ./sandbox:20: calling bad function ([C]:-1): __index
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in metamethod '__index'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
I tried to fix this by adding the following to validfunc:
[getmetatable(io.stdout).__index] = true,
...but I still get pretty much the same error. I could go on guessing and trying more things to add, but this is what I would like to avoid.
I have two related questions:
What can I add to validfunc so that sandbox will run helloworld (as is) to completion?
More importantly, what is a systematic way to find determine what to add to a whitelist table?
Part (2) is the heart of this post. I am looking for tools/techniques that remove the guesswork from the problem of populating a whitelist table.
(I know that I can get helloworld to work if I replace io.stdout:write with print, register print in sandbox's validfunc, and pass {print = print} as the last argument to loadfile, but doing this does not answer the general question of how to systematically determine what needs to be added to the whitelist to allow some specific code to work in the sandbox.)
EDIT: Ask #DarkWiiPlayer pointed out, the calling bad function error is being triggered by the calling of an unregistered function (__index?), which happened as part of the response to an earlier attempt to index a nil value error. So, this post's questions are all about systematically determining what to add to validfunc to allow Lua to emit the attempt to index a nil value error normally.
I should add that the question of which function's call triggered the hook's execution responsible for the calling bad function error message is at the moment completely unclear. This error message blames the error on __index, but I suspect that this may be a red herring, possibly due to a bug in Lua.
Why suspect a bug in Lua? If I change the error call in sandbox slightly to
error(string.format("calling bad function (%s:%d): %s (%s)",
info.short_src, info.linedefined, (info.name or "?"),
info.func))
...then the error message looks like this:
lua5.3: ./sandbox:20: calling bad function ([C]:-1): __index (function: 0x55b391b79ef0)
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in metamethod '__index'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
Nothing surprising there, but if now I change helloworld.lua to
# Filename: helloworld.lua
nonexistent()
io.stdout:write("Hello, World!\n")
...and run it under sandbox, the error message becomes
lua5.3: ./sandbox:20: calling bad function ([C]:-1): nonexistent (function: 0x556a161cdef0)
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in global 'nonexistent'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
From this error message, one may conclude that nonexistent is a real function; after all, it's sitting right there at 0x556a161cdef0! But we know that nonexistent lives up to its name: it doesn't exist!
The whiff of a bug is definitely in the air. It could be that the function that is triggering the hook should really be excluded from those that trigger such "c"-masked hooks? Be that as it may, it appears that, in this particular situation, the call to debug.info is returning inconsistent information (since the name of the function [e.g. nonexistent] clearly does not correspond at all to the actual function object [e.g. function: 0x556a161cdef0] that is supposedly triggering the hook).

(Final answer at the bottom, feel free to skip until the <hr> line)
I'll explain my debugging step by step.
This is a really weird phenomenon. After some testing, I've managed to narrow it down a bit:
Since you pass {} to load, the function runs with an empty environment, so io is, in fact, nil (and io.stdout would error anyway)
The error happens directly when attempting to index io (which is a nil value)
The functio __index is a C function (see error message)
My first intuition was that __index was called somewhere internally. Thus, to find out what it does, I decided to look at its locals in hopes of guessing what it does.
A quick helper function I threw together:
local function locals(f)
return function(f, n)
local name, value = debug.getlocal(f+1, n)
if name then
return n+1, name, value
end
end, f, 1
end
Insert that right before the line where the error is raised:
for idx, name, value in locals(2) do
print(name, value)
end
error(string.format("calling bad function (%s:%d): %s", info.short_src, info.linedefined, (info.name or "?")))
This led to an interesting result:
(*temporary) stdin:43: attempt to index a nil value (global 'io')
(*temporary) table: 0x563cef2fd170
lua: stdin:29: calling bad function ([C]:-1): __index
stack traceback:
[C]: in function 'error'
stdin:29: in function <stdin:21>
[C]: in metamethod '__index'
stdin:43: in function 'f'
stdin:49: in main chunk
[C]: in ?
shell returned 1
Why is there a temporary string value with a completely different error message?
By the way, this error makes total sense; io does not exist because of the empty environment, so indexing it should obviously raise just that error.
It's honestly a very interesting error, but I'll leave it at this, as you're learning the language and this hint might be enough for you to figure it out on your own. It's also a very nice chance to actually use (and get to know) the debug module in a more practical context.
Actual Solution
After some time has now passed, I came back to add a proper solution to this problem, but I really already did just that. The weird error reporting is just Lua being weird. The real error is the empty environment that's set when loading the chunk, as I mentioned a few paragraphs above.
From the manual:
load (chunk [, chunkname [, mode [, env]]])
Loads a chunk.
[...]
If the resulting function has upvalues, the first upvalue is set to the value of env, if that parameter is given, or to the value of the global environment. Other upvalues are initialized with nil. (When you load a main chunk, the resulting function will always have exactly one upvalue, the _ENV variable (see §2.2). However, when you load a binary chunk created from a function (see string.dump), the resulting function can have an arbitrary number of upvalues.) All upvalues are fresh, that is, they are not shared with any other function.
[...]
Now, in a "main chunk", i.e. one loaded from a text Lua file, the first (and only) upvalue is always the environment of the chunk, so where it will look for "globals" (this is slightly different in Lua 5.1). Since an empty table is passed in, the chunk has no access to any of the global variables like string or io.
Therefore, when the function f() tries to index io, Lua throws an error "attempt to index a nil value", because io is nil. For whatever reason Lua then makes some internal function calls that end up triggering the blacklist, causing a new error that shadows the previous one; this makes debugging this error extremely inconvenient and almost impossible without using the debug library to get additional information about the call stack.
I ultimately only realized this myself after I noticed the original error message while looking at the locals of the function that made the blocked call.
I hope this solves the problem :)

Related

Dialyzer does not catch errors on returned functions

Background
While playing around with dialyzer, typespecs and currying, I was able to create an example of a false positive in dialyzer.
For the purposes of this MWE, I am using diallyxir (versions included) because it makes my life easier. The author of dialyxir confirmed this was not a problem on their side, so that possibility is excluded for now.
Environment
$ elixir -v
Erlang/OTP 24 [erts-12.2.1] [source] [64-bit] [smp:12:12] [ds:12:12:10] [async-threads:1] [jit]
Elixir 1.13.2 (compiled with Erlang/OTP 24)
Which version of Dialyxir are you using? (cat mix.lock | grep dialyxir):
"dialyxir": {:hex, :dialyxir, "1.1.0", "c5aab0d6e71e5522e77beff7ba9e08f8e02bad90dfbeffae60eaf0cb47e29488", [:mix], [{:erlex, ">= 0.2.6", [hex: :erlex, repo: "hexpm", optional: false]}], "hexpm", "07ea8e49c45f15264ebe6d5b93799d4dd56a44036cf42d0ad9c960bc266c0b9a"},
"erlex": {:hex, :erlex, "0.2.6", "c7987d15e899c7a2f34f5420d2a2ea0d659682c06ac607572df55a43753aa12e", [:mix], [], "hexpm", "2ed2e25711feb44d52b17d2780eabf998452f6efda104877a3881c2f8c0c0c75"},
Current behavior
Given the following code sample:
defmodule PracticingCurrying do
#spec greater_than(integer()) :: (integer() -> String.t())
def greater_than(min) do
fn number -> number > min end
end
end
Which clearly has a wrong typing, I get a success message:
$ mix dialyzer
Compiling 1 file (.ex)
Generated grokking_fp app
Finding suitable PLTs
Checking PLT...
[:compiler, :currying, :elixir, :gradient, :gradualizer, :kernel, :logger, :stdlib, :syntax_tools]
Looking up modules in dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt
Finding applications for dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt
Finding modules for dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt
Checking 518 modules in dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt
Adding 44 modules to dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt
done in 0m24.18s
No :ignore_warnings opt specified in mix.exs and default does not exist.
Starting Dialyzer
[
check_plt: false,
init_plt: '/home/user/Workplace/fl4m3/grokking_fp/_build/dev/dialyxir_erlang-24.2.1_elixir-1.13.2_deps-dev.plt',
files: ['/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.ImmutableValues.beam',
'/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.PracticingCurrying.beam',
'/home/user/Workplace/fl4m3/grokking_fp/_build/dev/lib/grokking_fp/ebin/Elixir.TipCalculator.beam'],
warnings: [:unknown]
]
Total errors: 0, Skipped: 0, Unnecessary Skips: 0
done in 0m1.02s
done (passed successfully)
Expected behavior
I expected dialyzer to tell me the correct spec is #spec greater_than(integer()) :: (integer() -> bool()).
As a side note (and comparison, if you will) gradient does pick up the error.
I know that comparing these tools is like comparing oranges and apples, but I think it is still worth mentioning.
Questions
Is dialyzer not intended to catch this type of error?
If it should catch the error, what can possibly be failing? (is it my example that is incorrect, or something inside dialyzer?)
I personally find it hard to believe this could be a bug in Dialyzer, the tool has been used rather extensively by a lot of people for me to be the first to discover this error. However, I cannot explain what is happening.
Help is appreciated.
Dialyzer is pretty optimistic in its analysis and ignores some categories of errors.
This article provides some advanced explanations about its approach and limitations.
In the particular case of anonymous functions, dialyzer seems to perform a very minimal check
when they are being declared: it will ignore both the types of its arguments and return type, e.g.
the following doesn't lead any error even if is clearly wrong:
# no error
#spec add(integer()) :: (String.t() -> String.t())
def add(x) do
fn y -> x + y end
end
It will however point out a mismatch in arity, e.g.
# invalid_contract
# The #spec for the function does not match the success typing of the function.
#spec add2(integer()) :: (integer(), integer() -> integer())
def add2(x) do
fn y -> x + y end
end
Dialyzer might be able to detect a type conflict when trying to use the anonymous function,
but this isn't guaranteed (see article above), and the error message might not be helpful:
# Function main/0 has no local return.
def main do
positive? = greater_than(0)
positive?.(2)
end
We don't know what is the problem exactly, not even the line causing the error. But at least we know there is one and can debug it.
In the following example, the error is a bit more informative (using :lists.map/2 instead of Enum.map/2 because
dialyzer doesn't understand the enumerable protocol):
# Function main2/0 has no local return.
def main2 do
positive? = greater_than(0)
# The function call will not succeed.
# :lists.map(_positive? :: (integer() -> none()), [-2 | 0 | 1, ...])
# will never return since the success typing arguments are
# ((_ -> any()), [any()])
:lists.map(positive?, [1, 0, -2])
end
This tells us that dialyzer inferred the return type of greater_than/1 to be (integer() -> none()).
none is described in the article above as:
This is a special type that means that no term or type is valid.
Usually, when Dialyzer boils down the possible return values of a function to none(), it means the function should crash.
It is synonymous with "this stuff won't work."
So dialyzer knows that this function cannot be called successfully, but doesn't consider it to be a type clash until actually called, so it will allow the declaration (in the same way you can perfectly create a function that just raises).
Disclaimer: I couldn't find an official explanation regarding how dialyzer handles anonymous
functions in detail, so the explanations above are based on my observations and interpretation

Julia print statement not working in certain cases

I've written a prime-generating function generatePrimes (full code here) that takes input bound::Int64 and returns a Vector{Int64} of all primes up to bound. After the function definition, I have the following code:
println("Generating primes...")
println("Last prime: ", generatePrimes(10^7)[end])
println("Primes generated.")
which prints, unexpectedly,
Generating primes...
9999991
Primes generated.
This output misses the "Last prime: " segment of the second print statement. The output does work as expected for smaller inputs; any input at least up to 10^6, but somehow fails for 10^7. I've tried several workarounds for this (e.g. assigning the returned value or converting it to a string before calling it in a print statement, combining the print statements, et cetera) and discovered some other weird behaviour: if the "Last prime", is removed from the second print statement, for input 10^7, the last prime doesn't print at all and all I get is a blank line between the first and third print statements. These issues are probably related, and I can't seem to find anything online about why some print statements wouldn't work in Julia.
Thanks so much for any clarification!
Edit: Per DNF's suggestion, following are some reductions to this issue:
Removing the first and last print statements doesn't change anything -- a blank line is always printed in the case I outlined and each of the cases below.
println(generatePrimes(10^7)[end]) # output: empty line
Calling the function and storing the last index in a variable before calling println doesn't change anything either; the cases below work exactly the same either way.
lastPrime::Int = generatePrimes(10^7)[end]
println(lastPrime) # output: empty line
If I call the function in whatever form immediately before a println, an empty line is printed regardless of what's inside the println.
lastPrime::Int = generatePrimes(10^7)[end]
println("This doesn't print") # output: empty line
println("This does print") # output: This does print
If I call the function (or print the pre-generated-and-stored function result) inside a println, anything before the function call (that's also inside the println) isn't printed. The 9999991 and anything else there may be after the function call is printed only if there is something else inside the println before the function call.
# Example 1
println(generatePrimes(10^7)[end]) # output: empty line
# Example 2
println("This first part doesn't print", generatePrimes(10^7)[end]) # output: 9999991
# Example 3
println("This first part doesn't print", generatePrimes(10^7)[end], " prints") # output: 9999991 prints
# Example 4
println(generatePrimes(10^7)[end], "prime doesn't print") # output: prime doesn't print
I could probably list twenty different variations of this same thing, but that probably wouldn't make things any clearer. In every single case version of this issue I've seen so far, the issue only manifests if there's that function call somewhere; println prints large integers just fine. That said, please let me know if anyone feels like they need more info. Thanks so much!
Most likely you are running this code from Atom Juno which recently has some issues with buffering standard output (already reported by others and I also sometimes have this problem).
One thing you can try to do is to flush your standard output
flush(stdout)
Like with any unstable bug restarting Atom Juno also seems to help.
I had the same issue. For me, changing the terminal renderer (File -> Settings -> Packages -> julia-client -> Terminal Options) from webgl to canvas (see pic below) seems to solve the issue.
change terminal renderer
I've also encountered this problem many times. (First time, it was triggered after using the debugger. It is probably unrelated but I have been using Julia+Juno for 2 weeks prior to this issue.)
In my case, the code before the println statement needed to have multiple dictionary assignation (with new keys) in order to trigger the behavior.
I also confirmed that the same code ran in Command Prompt (with same Julia interpreter) prints fine. Any hints about how to further investigate this will be appreciated.
I temporarily solve this issue by printing to stderr, thinking that this stream has more stringent flush mechanism: println(stderr, "hello!")

How do I directly make an val binding to an option value in SML?

val SOME i = Int.fromString e
I have a line like this on my code and smlnj shows me this warning
vm.sml:84.7-84.32 Warning: binding not exhaustive
SOME i = ...
Is this bad practice? Should I use a function to handle the option or am I missing something?
If you're just working on a small script you'll run once, it's not necessarily bad practice: If Int.fromString e fails (and returns NONE instead of SOME _), then the value binding will fail and an exception will be raised to the appropriate handler (or the program will exit, if ther is no handler). To disable this warning, you can run the top-level statement (for SML-NJ 110.96): Control.MC.bindNonExhaustiveWarn := false;.
As an alternative approach, you could throw a custom exception:
val i =
case Int.fromString e
of SOME i => i
| NONE => raise Fail ("Expected string value to be parseable as an int; got: " ^ e)
The exception message should be written in a way that's appropriate to the provenance of the e value. (If e comes from command-line input, the program should tell the user that a number was expected there; if e comes from a file, the program should tell the user which file is formatted incorrectly and where the formatting error was found.)
As yet another alternative: If your program is meant to be long-running and builds up a lot of state, it wouldn't be very user-friendly if the program crashed as soon as the user entered an ill-formed string on the command line. (The user would be quite sad in this case, as all the state they built up in the program would have been lost.) In this case, you could repeatedly read from stdin until the user types in input that can be parsed as an int. This is incidentally more-or-less what the SML/NJ REPL does: instead of something like val SOME parsedProgram = SMLofNJ.parse (getUserInput ()), it would want to do something like:
fun getNextParsedProgram () =
case SMLofNJ.parse (getUserInput ())
of NONE => (print "ERROR: failed to parse\n"; getNextParsedProgram ())
| SOME parsedProgram => parsedProgram
In summary,
For a short-lived script or a script you don't intend on running often, turning off the warning is a fine option.
For a program where it's unexpected that e would be an unparseable string, you could raise a custom exception that explains what went wrong and how the user can fix it.
For longer-lived programs where better error handling is desired, you should respect the NONE case by pattern-matching on the result of fromString, which forces you to come up with some sort of error-handling behavior.

How to quit/exit from file included in the terminal

What can I do within a file "example.jl" to exit/return from a call to include() in the command line
julia> include("example.jl")
without existing julia itself. quit() will just terminate julia itself.
Edit: For me this would be useful while interactively developing code, for example to include a test file and return from the execution to the julia prompt when a certain condition is met or do only compile the tests I am currently working on without reorganizing the code to much.
I'm not quite sure what you're looking to do, but it sounds like you might be better off writing your code as a function, and use a return to exit. You could even call the function in the include.
Kristoffer will not love it, but
stop(text="Stop.") = throw(StopException(text))
struct StopException{T}
S::T
end
function Base.showerror(io::IO, ex::StopException, bt; backtrace=true)
Base.with_output_color(get(io, :color, false) ? :green : :nothing, io) do io
showerror(io, ex.S)
end
end
will give a nice, less alarming message than just throwing an error.
julia> stop("Stopped. Reason: Converged.")
ERROR: "Stopped. Reason: Converged."
Source: https://discourse.julialang.org/t/a-julia-equivalent-to-rs-stop/36568/12
You have a latent need for a debugging workflow in Julia. If you use Revise.jl and Rebugger.jl you can do exactly what you are asking for.
You can put in a breakpoint and step into code that is in an included file.
If you include a file from the julia prompt that you want tracked by Revise.jl, you need to use includet(.
The keyboard shortcuts in Rebugger let you iterate and inspect variables and modify code and rerun it from within an included file with real values.
Revise lets you reload functions and modules without needing to restart a julia session to pick up the changes.
https://timholy.github.io/Rebugger.jl/stable/
https://timholy.github.io/Revise.jl/stable/
The combination is very powerful and is described deeply by Tim Holy.
https://www.youtube.com/watch?v=SU0SmQnnGys
https://youtu.be/KuM0AGaN09s?t=515
Note that there are some limitations with Revise, such as it doesn't reset global variables, so if you are using some global count or something, it won't reset it for the next run through or when you go back into it. Also it isn't great with runtests.jl and the Test package. So as you develop with Revise, when you are done, you move it into your runtests.jl.
Also the Juno IDE (Atom + uber-juno package) has good support for code inspection and running line by line and the debugging has gotten some good support lately. I've used Rebugger from the julia prompt more than from the Juno IDE.
Hope that helps.
#DanielArndt is right.
It's just create a dummy function in your include file and put all the code inside (except other functions and variable declaration part that will be place before). So you can use return where you wish. The variables that only are used in the local context can stay inside dummy function. Then it's just call the new function in the end.
Suppose that the previous code is:
function func1(...)
....
end
function func2(...)
....
end
var1 = valor1
var2 = valor2
localVar = valor3
1st code part
# I want exit here!
2nd code part
Your code will look like this:
var1 = valor1
var2 = valor2
function func1(...)
....
end
function func2(...)
....
end
function dummy()
localVar = valor3
1st code part
return # it's the last running line!
2nd code part
end
dummy()
Other possibility is placing the top variables inside a function with a global prefix.
function dummy()
global var1 = valor1
global var2 = valor2
...
end
That global variables can be used inside auxiliary function (static scope) and outside in the REPL
Another variant only declares the variables and its posterior use is free
function dummy()
global var1, var2
...
end

Handling errors in math functions

What is good practice for error handling in math-related functions? I'm building up a library (module) of specialized functions and my main purpose is to make debugging easier for the code calling these functions -- not to make a shiny user-friendly error handling facility.
Below is a simple example in VBA, but I'm interested in hearing from other languages as well. I'm not quite sure where I should be returning an error message/status/flag. As an extra argument?
Function AddArrays(arr1, arr2)
Dim i As Long
Dim result As Variant
' Some error trapping code here, e.g.
' - Are input arrays of same size?
' - Are input arrays numeric? (can't add strings, objects...)
' - Etc.
' If no errors found, do the actual work...
ReDim result(LBound(arr1) To UBound(arr1))
For i = LBound(arr1) To UBound(arr1)
result(i) = arr1(i) + arr2(i)
Next i
AddArrays = result
End Function
or something like the following. The function returns a boolean "success" flag (as in the example below, which would return False if the input arrays weren't numeric etc.), or an error number/message of some other type.
Function AddArrays(arr1, arr2, result) As Boolean
' same code as above
AddArrays = booSuccess
End Function
However I'm not too crazy about this, since it ruins the nice and readable calling syntax, i.e. can't say c = AddArrays(a,b) anymore.
I'm open to suggestions!
Obviously error handling in general is a big topic, and what the best practice is depends a lot on the capabilities of the language you're working with and how the routine you're coding fits in with other routines. So I'll constrain my answer to VBA (used within Excel) and library-type routines of the sort you're describing.
Exceptions vs. Error Codes in Library Routines
In this case, I would not use a return code. VBA supports a form of exception handling that, while not as powerful as the more standard form found in C++/Java/??.NET, is pretty similar. So the advice from those languages generally applies. You use exceptions to tell calling routines that the called routine can't do it's job for whatever reason. You handle exceptions at the lowest level where you can do something meaningful about that failue.
Bjarne Stroustrup gives a very good explanation of why exceptions are better than error codes for this kind of situation in this book. (The book is about C++, but the principles behind C++ exception handling and VBA error handling are the same.)
http://www2.research.att.com/~bs/3rd.html
Here is a nice excerpt from Section 8.3:
When a program is composed of separate
modules, and especially when those
modules come from separately developed
libraries, error handling needs to be
separated into two distinct parts: [1]
The reporting of error conditions that
cannot be resolved locally [2] The
handling of errors detected elsewhere
The author of a library can detect
runtime errors but does not in general
have any idea what to do about them.
The user of a library may know how to
cope with such errors but cannot
detect them – or else they would be
handled in the user’s code and not
left for the library to find.
Sections 14.1 and 14.9 also address exceptions vs. error codes in a library context. (There is a copy of the book online at archive.org.)
There is probably lots more about this on stackoverflow. I just found this, for example:
Exception vs. error-code vs. assert
(There can be pitfalls involving proper management of resources that must be cleaned up when using exceptions, but they don't really apply here.)
Exceptions in VBA
Here is how raising an exception looks in VBA (although the VBA terminology is "raising an error"):
Function AddArrays(arr1, arr2)
Dim i As Long
Dim result As Variant
' Some error finding code here, e.g.
' - Are input arrays of same size?
' - Are input arrays numeric? (can't add strings, objects...)
' - Etc.
'Assume errorsFound is a variable you populated above...
If errorsFound Then
Call Err.Raise(SOME_BAD_INPUT_CONSTANT) 'See help about the VBA Err object. (SOME_BAD_INPUT_CONSTANT is something you would have defined.)
End If
' If no errors found, do the actual work...
ReDim result(LBound(arr1) To UBound(arr1))
For i = LBound(arr1) To UBound(arr1)
result(i) = arr1(i) + arr2(i)
Next i
AddArrays = result
End Function
If this routine doesn't catch the error, VBA will give other routines above it in the call stack a chance to (See this: VBA Error "Bubble Up"). Here is how a caller might do so:
Public Function addExcelArrays(a1, a2)
On Error Goto EH
addExcelArrays = AddArrays(a1, a2)
Exit Function
EH:
'ERR_VBA_TYPE_MISMATCH isn't defined by VBA, but it's value is 13...
If Err.Number = SOME_BAD_INPUT_CONSTANT Or Err.Number = ERR_VBA_TYPE_MISMATCH Then
'We expected this might happen every so often...
addExcelArrays = CVErr(xlErrValue)
Else
'We don't know what happened...
Call debugAlertUnexpectedError() 'This is something you would have defined
End If
End Function
What "do something meaningful" means depends on the context of your application. In the case of my caller example above, it decides that some errors should be handled by returning an error value that Excel can put in a worksheet cell, while others require a nasty alert. (Here's where the case of VBA within Excel is actually not a bad specific example, because lots of applications make a distinction between internal and external routines, and between exceptions you expect to be able to handle and error conditions that you just want to know about but for which you have no response.)
Don't Forget Assertions
Because you mentioned debugging, it's also worth noting the role of assertions. If you expect AddArrays to only ever be called by routines that have actually created their own arrays or otherwise verified they are using arrays, you might do this:
Function AddArrays(arr1, arr2)
Dim i As Long
Dim result As Variant
Debug.Assert IsArray(arr1)
Debug.Assert IsArray(arr2)
'rest of code...
End Function
A fantastic discussion of the difference between assertions and exceptions is here:
Debug.Assert vs Exception Throwing
I gave an example here:
Is assert evil?
Some VBA Advice About General Array Handling Routines
Finally, as a VBA-specific note, there are VBA variants and arrays come with a number of pitfalls that must be avoided when you're trying to write general library routines. Arrays might have more than one dimension, their elements might be objects or other arrays, their start and end indices might be anything, etc. Here is an example (untested and not trying to be exhaustive) that accounts for some of that:
'NOTE: This has not been tested and isn't necessarily exhaustive! It's just
'an example!
Function addArrays(arr1, arr2)
'Note use of some other library functions you might have...
'* isVect(v) returns True only if v is an array of one and only one
' dimension
'* lengthOfArr(v) returns the size of an array in the first dimension
'* check(condition, errNum) raises an error with Err.Number = errNum if
' condition is False
'Assert stuff that you assume your caller (which is part of your
'application) has already done - i.e. you assume the caller created
'the inputs, or has already dealt with grossly-malformed inputs
Debug.Assert isVect(arr1)
Debug.Assert isVect(arr2)
Debug.Assert lengthOfArr(arr1) = lengthOfArr(arr2)
Debug.Assert lengthOfArr(arr1) > 0
'Account for VBA array index flexibility hell...
ReDim result(1 To lengthOfArr(arr1)) As Double
Dim indResult As Long
Dim ind1 As Long
ind1 = LBound(arr1)
Dim ind2 As Long
ind2 = LBound(arr2)
Dim v1
Dim v2
For indResult = 1 To lengthOfArr(arr1)
'Note implicit coercion of ranges to values. Note that VBA will raise
'an error if an object with no default property is assigned to a
'variant.
v1 = arr1(ind1)
v2 = arr2(ind2)
'Raise errors if we have any non-numbers. (Don't count a string
'with numeric text as a number).
Call check(IsNumeric(v1) And VarType(v1) <> vbString, xlErrValue)
Call check(IsNumeric(v2) And VarType(v2) <> vbString, xlErrValue)
'Now we don't expect this to raise errors.
result(indResult) = v1 + v2
ind1 = ind1 + 1
ind2 = ind2 + 1
Next indResult
addArrays = result
End Function
There's lots of ways to trap errors, some better than others. Alot of it depends on on the nature of the error and how you want to handle it.
1st: In your examples, you aren't handling the basic compiling & runtime errors (see code below).
Function Foobar (Arg1, Arg2)
On Error goto EH
Do stuff
Exit Function
EH:
msgbox "Error" & Err.Description
End Function
2nd: Using the framework example above, you can add all the if-then logical error trapping statements you want & feed it to the EH step. You can even add multiple EH steps if your function is complex enough. Setting things up this way allows you to find the particular function where your logic error occurred.
3rd: In your last example, ending that function as a boolean is not the best method. If you were able to add the 2 arrays, then that function should return the resultant array. If not, it should throw up a msgbox-style error.
4th: I recently started doing a little trick that can be very helpful in some situations. In your VBA Editor, go to Tools->Options->General->Break on ALL errors. This is very helpful when you already have your error handling code in place, but you want to go the exact line where the error occurred and you don't feel like deleting perfectly good code.
Example: Let's say you want to catch an error that wouldn't be caught normally by VBA, i.e. an integer variable should always have a value >2. Somewhere in your code, say If intvar<=2 then goto EH. Then in your EH step, add If intvar<=2 then msgbox "Intvar=" & Intvar.
First, PowerUser gave you a good answer already--this is an expansion on that one.
A trick that I just learned is the "double resume", thus:
Function Foobar (Arg1, Arg2)
On Error goto EH
Do stuff
FuncExit:
Exit Function
EH:
msgbox "Error" & Err.Description
Resume FuncExit
Resume
End Function
What happens here is that in the execution of finished code, your code throws up a MsgBox when an error is encountered, then runs the Exit Function statement & goes on its way (just the same as dropping out the bottom with End Function). However, when you're debugging and you get that MsgBox you instead do a manual Ctrl-Break, then set next statement (Ctrl-F9) to the unadorned Resume and press F8 to step--it goes right back to the line that threw the error. You don't even have to take the extra Resume statements out, since they will never execute without manual intervention.
The other point on which I want to argue (gently) with PowerUser is in the final example. I think it's best to avoid unneeded GoTo statements. A better approach is If intvar<=2 then err.raise SomeCustomNumber. Make sure you use a number that isn't already in use--search 'VB custom error' for more information.

Resources