I have a difficult time classifying “infinite loop error”. It does not necessarily cause a system crash, but it sometimes does.
Is there a standard answer in the IT community?
Related
My question is: Why does an infinite recursion crash, but not an infinite loop? They both feel like normal forever iterators yet one crashes while the other doesn't. I am looking for the hardware/low-level response. I usually get stackoverflow error when experimenting this on Jupyter with Python. Also, when I mean low-level response, I mean what happens in the computer hardware(RAM, CPU) that causes the infinite recursion to crash, but not the infinite iteration.
Because an infinite recursion keep storing recursive calls in the stack (which is one of the memories held by your application), so it will eventually fill the stack completely, hence you get the classic stack overflow error.
In an infinite loop, you don't store anything and frequently you just keep updating variables and things like that, which won't occupy more memory than these variables already do, so nothing will break. However, you can have an infinite loop breaking if you add things in the memory inside your loop, like adding an element to a list inside the loop.
I am using CVXPY (version 1.0) to solve a quadratic program (QP) and I often get this exception:
SolverError: Solver 'xxx' failed. Try another solver.
which makes my program really fragile. I have tried different solvers, including CVXOPT, OSQP, ECOS, ECOS_BB, SCS. They all have more or less the same problem. I noticed that when I make the stopping criteria of the solver more strict (e.g., decrease the absolute error tolerance), I get SolverError more frequently, and when I make it less strict, the SolverError problem is attenuated and even disappears. I also find that the way that CVXPY throws SolverError is stochastic: if I run the same program many times, there are some runs having SolverError and others get the optimal result.
Although I can avoid SolverError just by trying more times and lowering the stopping criteria, I really want to understand the real specific reasons behind the exception
SolverError: Solver 'xxx' failed. Try another solver.
This error is not really informative and I have no clues on what to do to improve the problem solving robustness. Are its causes specific to a solver? Is this exception thrown for a set of well-defined situations? Or is it just a way of saying "something goes wrong for unknown reasons"? What reasons might those be?
If you have a solver error, you need to either debug by calling the solve method with verbose=True to see the detailed error message or use a more robust commercial solver like MOSEK. The specific reasons for solver errors depend on the solver used. A common cause is having too tight a numerical tolerance or having badly scaled data (i.e., the dynamic range of the floats in your program is too large). I will modify the SolverError message to mention using verbose=True.
Having spent more time chasing my tail on this than is healthy, I'm just going to have to ask:
In R, how can one figure out what error handlers are in force at a given point in the code? Somebody somewhere is gracefully intercepting my errors, and I want them to stop!
I have two questions-
Q1. Is there a more efficient way to handle the error situation in MPI, other than check-point/rollback? I see that if a node "dies", the program halts abruptly.. Is there any way to go ahead with the execution after a node dies ?? (no issues if it is at the cost of accuracy)
Q2. I read in "http://stackoverflow.com/questions/144309/what-is-the-best-mpi-implementation", that OpenMPI has better fault tolerance and recently MPICH-2 has also come up with similar features.. does anybody know what they are and how to use them? is it a "mode"? can they help in the situation stated in Q1 ?
kindly reply. Thank you.
MPI - all implementations - have had the ability to continue after an error for a while. The default is to die - that is, the default error handler is MPI_ERRORS_ARE_FATAL - but that can be set (eg, see the discussion here). But the standard doesn't currently much beyond that; that is, it's hard to recover and continue after such an error. If your program is sufficiently simple - some sort of master-worker type of setup - it may be possible to continue this way.
The MPI forum is currently working on what will become MPI-3, and error handling and fault tolerance will be an important component of the new standard (there's a working group dedicated to the topic). Until that work is complete, however, the only way to get stronger fault tolerance out of MPI is to use earlier, nonstandard, extensions. FT-MPI was a project that developed a very robust MPI, but unfortuantely it's based on MPI1.2; a very early version of the standard. The claim here is that they're now working with OpenMPI, but I don't know what's become of that. There's MPICH-V, based on MPI2, but that's more checkpoint-restart based than what I think you're looking for.
Updated to add: The fault tolerance didn't make it into MPI-3, but the working group continues its work and the expectation is that something will result from that before too long.
i am creating a system. What i want to know is if a msg is unsupported what should it do? should i throw saying unsupported msg? should i return 0 or -1? or should i set an errno (base->errno_). Some messages i wouldnt care if there was an error (such as setBorderColour). Others i would (addText or perhaps save if i create a save cmd).
I want to know what the best method is for 1) coding quickly 2) debugging 3) extending and maintenance. I may make debugging 3rd, its hard to debug ATM but thats bc there is a lot of missing code which i didnt fill in. Actual bugs arent hard to correct. Whats the best way to let the user know there is an error?
The system works something like this but not exactly the same. This is C style and mycode has a bunch of inline functions that wrap settext(const char*text){ to msg(this, esettext, text)
Base base2, base;
base = get_root();
base2 = msg(base, create, BASE_TYPE);
msg(base2, setText, "my text");
const char *p = (const char *)msg(base2, getText);
Generally if it's C++, prefer exceptions unless performance is critical or unless you may be running in an environment (e.g. an embedded platform) that does not support exceptions. Exceptions are by far the best choice for debugging because they are very noticeable when they occur and are ignored. Further, exceptions are self-documenting. They have their own type name and usually a contained message that explains the error. Return codes and errno require separate code definitions and some kind of out-of-band way of communicating what the codes mean in any given context (e.g. man pages, comments).
For coding quickly, return codes are probably easier since they don't involve potentially defining your own exception types, and often the error checking code is not as verbose as with exceptions. But of course the big risk is that it is much easier to silently ignore error return codes, leading to problems that may not be noticed until well after they occur, making debugging and maintenance a nightmare.
Try to avoid ever using errno, since it's very error-prone itself. It's a global, so you never know who is resetting it, and it is most definitively not thread safe.
Edit: I just realized you meant an errno member variable and not the C-style errno. That's better in that it's not global, but you still need additional constructs to make it thread safe (if your app is multi-threaded), and it retains all the problems of a return code.
Returning an error code requires discipline because the error code must be explicitly checked and then passed up. We wrote a large C-based system that used this approach and it took a while to remove all the "lost" errors. We eventually developed some techniques to catch this problem (such as storing the error code in a thread-global location and checking at the top level to see that the returned error code matched the saved error code).
Exception handling is easier to code quickly because if you're writing code and you're not sure how to handle the error, you can just let it propagate upwards (assuming you're not using Java where you have to deal with checked exceptions). It's better for debugging because you can get a stack trace of where the exception occurred (and because you can build in top level exception handlers to catch problems that should have been caught elsewhere). It's better for maintaining because if you've done things right, you'll be notified of problem faster.
However, there are some design issues with exception handling, and if you get it wrong, you'll be worse off. In short, if you're writing code that you don't know how to handle an exception, you should let it propagate up. Too many coders trap errors just to convert the exception and then rethrow it, resulting in spaghetti exception code that sometimes loses information about the original cause of the problem. This assumes that you have exception handlers at the top level (entry points).
Personally when it comes to output graphics, I feel a silent fail is fine. It just makes your picture wrong.
Graphical errors are super easy to spot anyways.
Personally, I would add an errno to your Base struct if it is pure 'C'. If this is C++ I'd throw an exception.
It depends on how 'fatal' these errors are. Does the user really need to see the error, or is it for other developers edification?
For maintainability you need to clearly document the errors that can occur and include clear examples of error handling.