Capture and handle multiple errors in sequence - r

I have an expression that spits out multiple errors:
stop("first")
stop("second")
stop("third")
print("success")
Is there a way to make this expression run all the way to the end (and, if possible, store all of the errors)?
The problem with tryCatch is that it stops execution on the first handled condition (so this will print "There was an error" exactly once).
tryCatch({
stop("first")
stop("second")
stop("third")
print("success")
}, error = function(e) print("There was an error"))
I read that withCallingHandlers will capture conditions and continue execution, so I tried something like this...
withCallingHandlers(
{...},
error = function(e) print("There was an error")
)
...but, although this does print the message, it also fails after the first error, which, as far as I can tell, is because the default restart just runs the current line again.
I think what I need to do is write a custom restart that simply skips the current expression and jumps to the next one, but I'm at a bit of a loss about how to go about that.
(Background: Basically, I'm trying to replicate what happens inside testthat::test_that, which is that even though each failing expectation throws an error, all expectations are run before quitting).

Related

Change "Error" label in stop function on R

Is it possible to change the Error label inside the stop() function?
For example:
stop("This is a error!")
Error: This is a error!
change to:
Incorrect: This is a error!
I need to use stop function, not warning or message functions.
There's a risk that this could be confusing for users seeing something that's an stopping the code without explicitly announcing that it's an error.
If you can guarantee that the user is using English, you can hack it a bit using the backspace character (\b) with:
stop("\b\b\b\b\b\b\bIncorrect: This is a error!")
but this relies on knowing how long the word "Error" is in English, a version that also works in other languages would be trickier.
Edited to add that even in English this doesn't work from inside a function:
throw_an_error <- function()stop("\b\b\b\b\b\b\bIncorrect: This is a error!")
throw_an_error()
# Error in throw_an_errIncorrect: This is a error!

How to systematically populate a whitelist for a sandboxing program?

On pp. 260-263 of Programming in Lua (4th ed.), the author discusses how to implement "sandboxing" (i.e. the running of untrusted code) in Lua.
When it comes to imposing limiting the functions that untrusted code can run, he recommends a "whitelist approach":
We should never think in terms of what functions to remove, but what functions to add.
This question is about tools and techniques for putting this suggestion into practice. (I expect there will be confusion on this point I want to emphasize it upfront.)
The author gives the following code as an illustration of a sandbox program based on a whitelist of allowed functions. (I have added or moved around some comments, and removed some blank lines, but I've copied the executable content verbatim from the book).
-- From p. 263 of *Programming in Lua* (4th ed.)
-- Listing 25.6. Using hooks to bar calls to unauthorized functions
local debug = require "debug"
local steplimit = 1000 -- maximum "steps" that can be performed
local count = 0 -- counter for steps
local validfunc = { -- set of authorized functions
[string.upper] = true,
[string.lower] = true,
... -- other authorized functions
}
local function hook (event)
if event == "call" then
local info = debug.getinfo(2, "fn")
if not validfunc[info.func] then
error("calling bad function: " .. (info.name or "?"))
end
end
count = count + 1
if count > steplimit then
error("script uses too much CPU")
end
end
local f = assert(loadfile(arg[1], "t", {})) -- load chunk
debug.sethook(hook, "", 100) -- set hook
f() -- run chunk
Right off the bat I am puzzled by this code, since the hook tests for event type (if event == "call" then...), and yet, when the hook is set, only count events are requested (debug.sethook(hook, "", 100)). Therefore, the whole song-and-dance with validfunc is for naught.
Maybe it is a typo. So I tried experimenting with this code, but I found it very difficult to put the whitelist technique in practice. The example below is a very simplified illustration of the type of problems I ran into.
First, here is a slightly modified version of the author's code.
#!/usr/bin/env lua5.3
-- Filename: sandbox
-- ----------------------------------------------------------------------------
local debug = require "debug"
local steplimit = 1000 -- maximum "steps" that can be performed
local count = 0 -- counter for steps
local validfunc = { -- set of authorized functions
[string.upper] = true,
[string.lower] = true,
[io.stdout.write] = true,
-- ... -- other authorized functions
}
local function hook (event)
if event == "call" then
local info = debug.getinfo(2, "fnS")
if not validfunc[info.func] then
error(string.format("calling bad function (%s:%d): %s",
info.short_src, info.linedefined, (info.name or "?")))
end
end
count = count + 1
if count > steplimit then
error("script uses too much CPU")
end
end
local f = assert(loadfile(arg[1], "t", {})) -- load chunk
validfunc[f] = true
debug.sethook(hook, "c", 100) -- set hook
f() -- run chunk
The most significant differences in the second snippet relative to the first one are:
the call to debug.sethook has "c" as mask;
the f function for the loaded chunk gets added to the validfunc whitelist;
io.stdout.write is added to the validfunc whitelist;
When I use this sandbox program to run the one-line script shown below:
# Filename: helloworld.lua
io.stdout:write("Hello, World!\n")
...I get the following error:
% ./sandbox helloworld.lua
lua5.3: ./sandbox:20: calling bad function ([C]:-1): __index
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in metamethod '__index'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
I tried to fix this by adding the following to validfunc:
[getmetatable(io.stdout).__index] = true,
...but I still get pretty much the same error. I could go on guessing and trying more things to add, but this is what I would like to avoid.
I have two related questions:
What can I add to validfunc so that sandbox will run helloworld (as is) to completion?
More importantly, what is a systematic way to find determine what to add to a whitelist table?
Part (2) is the heart of this post. I am looking for tools/techniques that remove the guesswork from the problem of populating a whitelist table.
(I know that I can get helloworld to work if I replace io.stdout:write with print, register print in sandbox's validfunc, and pass {print = print} as the last argument to loadfile, but doing this does not answer the general question of how to systematically determine what needs to be added to the whitelist to allow some specific code to work in the sandbox.)
EDIT: Ask #DarkWiiPlayer pointed out, the calling bad function error is being triggered by the calling of an unregistered function (__index?), which happened as part of the response to an earlier attempt to index a nil value error. So, this post's questions are all about systematically determining what to add to validfunc to allow Lua to emit the attempt to index a nil value error normally.
I should add that the question of which function's call triggered the hook's execution responsible for the calling bad function error message is at the moment completely unclear. This error message blames the error on __index, but I suspect that this may be a red herring, possibly due to a bug in Lua.
Why suspect a bug in Lua? If I change the error call in sandbox slightly to
error(string.format("calling bad function (%s:%d): %s (%s)",
info.short_src, info.linedefined, (info.name or "?"),
info.func))
...then the error message looks like this:
lua5.3: ./sandbox:20: calling bad function ([C]:-1): __index (function: 0x55b391b79ef0)
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in metamethod '__index'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
Nothing surprising there, but if now I change helloworld.lua to
# Filename: helloworld.lua
nonexistent()
io.stdout:write("Hello, World!\n")
...and run it under sandbox, the error message becomes
lua5.3: ./sandbox:20: calling bad function ([C]:-1): nonexistent (function: 0x556a161cdef0)
stack traceback:
[C]: in function 'error'
./sandbox:20: in function <./sandbox:16>
[C]: in global 'nonexistent'
helloworld.lua:3: in local 'f'
./sandbox:34: in main chunk
[C]: in ?
From this error message, one may conclude that nonexistent is a real function; after all, it's sitting right there at 0x556a161cdef0! But we know that nonexistent lives up to its name: it doesn't exist!
The whiff of a bug is definitely in the air. It could be that the function that is triggering the hook should really be excluded from those that trigger such "c"-masked hooks? Be that as it may, it appears that, in this particular situation, the call to debug.info is returning inconsistent information (since the name of the function [e.g. nonexistent] clearly does not correspond at all to the actual function object [e.g. function: 0x556a161cdef0] that is supposedly triggering the hook).
(Final answer at the bottom, feel free to skip until the <hr> line)
I'll explain my debugging step by step.
This is a really weird phenomenon. After some testing, I've managed to narrow it down a bit:
Since you pass {} to load, the function runs with an empty environment, so io is, in fact, nil (and io.stdout would error anyway)
The error happens directly when attempting to index io (which is a nil value)
The functio __index is a C function (see error message)
My first intuition was that __index was called somewhere internally. Thus, to find out what it does, I decided to look at its locals in hopes of guessing what it does.
A quick helper function I threw together:
local function locals(f)
return function(f, n)
local name, value = debug.getlocal(f+1, n)
if name then
return n+1, name, value
end
end, f, 1
end
Insert that right before the line where the error is raised:
for idx, name, value in locals(2) do
print(name, value)
end
error(string.format("calling bad function (%s:%d): %s", info.short_src, info.linedefined, (info.name or "?")))
This led to an interesting result:
(*temporary) stdin:43: attempt to index a nil value (global 'io')
(*temporary) table: 0x563cef2fd170
lua: stdin:29: calling bad function ([C]:-1): __index
stack traceback:
[C]: in function 'error'
stdin:29: in function <stdin:21>
[C]: in metamethod '__index'
stdin:43: in function 'f'
stdin:49: in main chunk
[C]: in ?
shell returned 1
Why is there a temporary string value with a completely different error message?
By the way, this error makes total sense; io does not exist because of the empty environment, so indexing it should obviously raise just that error.
It's honestly a very interesting error, but I'll leave it at this, as you're learning the language and this hint might be enough for you to figure it out on your own. It's also a very nice chance to actually use (and get to know) the debug module in a more practical context.
Actual Solution
After some time has now passed, I came back to add a proper solution to this problem, but I really already did just that. The weird error reporting is just Lua being weird. The real error is the empty environment that's set when loading the chunk, as I mentioned a few paragraphs above.
From the manual:
load (chunk [, chunkname [, mode [, env]]])
Loads a chunk.
[...]
If the resulting function has upvalues, the first upvalue is set to the value of env, if that parameter is given, or to the value of the global environment. Other upvalues are initialized with nil. (When you load a main chunk, the resulting function will always have exactly one upvalue, the _ENV variable (see ยง2.2). However, when you load a binary chunk created from a function (see string.dump), the resulting function can have an arbitrary number of upvalues.) All upvalues are fresh, that is, they are not shared with any other function.
[...]
Now, in a "main chunk", i.e. one loaded from a text Lua file, the first (and only) upvalue is always the environment of the chunk, so where it will look for "globals" (this is slightly different in Lua 5.1). Since an empty table is passed in, the chunk has no access to any of the global variables like string or io.
Therefore, when the function f() tries to index io, Lua throws an error "attempt to index a nil value", because io is nil. For whatever reason Lua then makes some internal function calls that end up triggering the blacklist, causing a new error that shadows the previous one; this makes debugging this error extremely inconvenient and almost impossible without using the debug library to get additional information about the call stack.
I ultimately only realized this myself after I noticed the original error message while looking at the locals of the function that made the blocked call.
I hope this solves the problem :)

Handling of condition objects in R

What exactly happens in
tryCatch(warning("W"),warning = function(c) cat(c$message))
As far as I understand this the call warning("W") creates a condition object c of type warning and then someone (who?) looks for a handler (where? and recognizes the handler how?) and calls the handler on c.
The task of tryCatch is only to "establish" the handlers. What does this mean? In short the question is: how exactly does the condition object get to its handler?
In my experience, tryCatch is used to catch and either ignore or specifically handle errors and warnings. While you can do warnings with this function, I see it more often used with withCallingHandlers (and invokeRestart("muffleWarning")).
The Advanced R book is a good reference for many topics; for this, I'd recommend two chapters:
Exceptions and debugging, where Hadley highlights one key difference:
withCallingHandlers() is a variant of tryCatch() that establishes local handlers, whereas tryCatch() registers exiting handlers. Local handlers are called in the same context as where the condition is signalled, without interrupting the execution of the function. When a exiting handler from tryCatch() is called, the execution of the function is interrupted and the handler is called. withCallingHandlers() is rarely needed, but is useful to be aware of.
Bold emphasis is mine, to highlight that tryCatch interrupts, withCallingHandlers does not. This means that when a warning is raised, nothing else is executed:
tryCatch({warning("foo");message("bar");}, warning=function(w) conditionMessage(w))
# [1] "foo"
tryCatch({warning("foo");message("bar");})
# Warning in tryCatchList(expr, classes, parentenv, handlers) : foo
# bar
Further details/examples are in the Beyond exception handling chapter.
tryCatch executes the expression in a context where anything "raised" is optionally caught and potentially altered, ignored, logged, etc. A common method I see is to accept any error and return a known entity (now without an error), such as
ret <- tryCatch(
something_that_fails(),
error = function(e) FALSE)
Unlike other operating systems that allow precise filters on what to handle (e.g., python and its try: ... except ValueError: syntax, ref: https://docs.python.org/3/tutorial/errors.html#handling-exceptions), R is a bit coarser in that it catches all errors you can get to figure out what kind of error it is.
If you look at the source of tryCatch and trace around its use of its self-defined functions, you'll see that ultimately it calls an R function .addCondHands that includes the list of handlers, matching on the names of the handlers (typically error and warning, though perhaps others might be used?). (The source for that function can be found here, though I don't find it very helpful in this context).
I don't know exactly how to answer "how exactly ... get to its handler", other than an exception is thrown, it is caught, and the handler is executed with the error object as an argument.

What is wrong with my message passing example?

I am attempting to send a message from one process that I spawned to another for an assignment, I feel like I am very close here, but I think my syntax is just a bit off:
-module(assignment6).
-export([start/1, process1/2, process2/0, send_message/2]).
process1(N, Pid) ->
Message = "This is the original Message",
if
N == 1 ->
timer:sleep(3000),
send_message(Pid, Message);
N > 1 ->
timer:sleep(3000),
send_message(Pid, Message),
process1(N-1, Pid);
true ->
io:fwrite("Negative/0, Int/Floating-Point Numbers not allowed")
end.
process2() ->
recieve
Message ->
io:fwrite(Message),
io:fwrite("~n");
end.
send_message(Pid, Message) ->
Pid ! {Message}.
start(N) ->
Pid = spawn(assignment6, process2, []),
spawn(assignment6, process1, [N, Pid]).
The goal of this program is that the Message, will be printed out N times when the function is started, but be delayed enough so that I can hot-swap the wording of the message mid-run. I just can't quite get the Message to process2 for printout.
Four small things:
It's spelled receive, not recieve
Remove the semicolon in process2. The last clause in a receive expression does not have a terminating semicolon. You can see this in the if expression in process1: the first two clauses end with a semicolon, but the third one does not.
In process2, print the message like this:
io:fwrite("~p~n", [Message])
Since Message is a tuple, not a string, passing it as the first argument to io:fwrite causes a badarg error. Let's ask io:fwrite to format it for us instead.
process2 should probably call itself after printing the message. Otherwise, it will receive one message and then exit.
So now you can run the code, and while it's running you can load a new version of the module with a different message (so called "hot code swapping"). Will that change the message being printed? Why / why not?
It won't. process1 does a local call to itself, which means that it stays in the old version of the module. Do an external call instead (explicitly specifying the module: assignment6:process1(N-1, Pid)), and it will switch to the new version.

stopifnot() vs. assertError()

I wonder what the differences between stopifnot() and assertError() are:
assertError() is not found by default (You'll have to load the "tools" package first), but stopifnot() is.
More significantly, assertError() always throws an error message, even if I pass arguments like TRUE or FALSE, while stopifnot() does the obvious and expected thing.
Reading the manual page did not help. What is the correct use instead of assertError(length(x) != 7)? If x is undefined, the statement produces no error, but as soon as it is defined, it is producing errors, independent of the length of x (7 or not).
The main difference is where they should be used.
stopIfnot aim at stopping an execution if some conditions are not met during the run where assertError aim at testing your code.
assertError expect it's parameter to raise an error, this is what happens when x is not defined, there's an error
> length(x) != 7
Error: object 'x' not found
When you pass this expression to assertError, it raise an error and assertError return the conditions met (the error in itself). This allow you to test the failure cases of your code.
So assertError is mostly used in tests cases in a Test Driven Development pattern (TDD), when your code/function is supposed to raise an error for some specific parameters and ensure when you update your function later you're not breaking it.
Example usage of stopifnot and assertError:
mydiv <- function(a,b) {
stopifnot(b>0)
a/b
}
And now lets make a test to ensure this will raise an error if we pass "b" as 0:
tryCatch(
assertError(mydiv(3,0)),
error = function(e) { print("Warning, mydiv accept to divide by 0") }
)
Running this code produce no output, desired behavior.
Now if we comment the stopifnot in mydiv like this:
mydiv <- function(a,b) {
#stopifnot(abs(b)>0)
a/b
}
And testing again the tryCatch block, we get this output:
[1] "Warning, mydiv accept to divide by 0"
This is a small example of testing a function really throw an error as expected.
The tryCatch block is just to showcase with a different message, I hope this give more light on the subject.

Resources