Math library design decision: throw exceptions or fail silently

Say you're designing a math library (in JS) from scratch: the usual Vector2/3/4, Matrix2/3/4, Quaternion and so on (standard stuff for WebGL apps). What would be the best way to handle bad input? (division by zero, inverting a singular matrix, computing the intersection point between 2 parallel lines and so on).
The two ways to deal with this would be to:
throw exceptions
I know there are plenty who like to know precisely when their code fails - sort of the same people who hate dynamic typing - but I can't help but think of the dreaded "Error 200: Division by zero" errors that I got so many of in my early days of programming years ago. The only solution was to sprinkle the code with checks to prevent any of these errors. That only made code UGLY. I also can't help but wonder whether this is why programming languages nowadays have adopted +/-Infinity and NaN.
or to fail silently
In this case, the possible scenarios when trying to execute the line:
singularMatrix.invert().add(otherMatrix)
would be:
singularMatrix.invert() would return BAD_MATRIX, and BAD_MATRIX.add() would do nothing, effectively stopping the computation (jQuery-style)
singularMatrix.invert() would fail but return itself unchanged and .add() would work
singularMatrix.invert() would fill the matrix with +/-Infinity and the computation would continue
I would personally prefer one of the latter options, but I'm totally open to arguments and alternatives (that's why I'm asking here on SO).
I don't know if the "best way" for this sort of thing has been invented yet.

Whatever you do, don't fail silently. There's no point in continuing the computation if the result is going to be wrong, and you don't want to show an incorrect result to the user and claim that it's correct. Nothing good can come of that, especially in a reusable library where you don't necessarily know what the caller will do with the result.
Throw an exception or return a special value that the caller can check for, such as undefined.

NaN and status codes (option 3)
The IEEE 754 standard was created to resolve a lot of these issues in a completely consistent way. For example, 1/0 evaluates to +Infinity and 0/0 evaluates to NaN, and these special values are baked into the processors themselves. This is neither a thrown exception (which would make some simple code very complex) nor a silent failure. You can trace the NaNs all the way back to where they appeared, giving you the debug information you need to fix the bug.
As far as large routines like matrix inversion go, numerical libraries generally follow the Unix convention of returning a status code. In JavaScript you can do this by returning an object with a status property.
Taking your example:
singularMatrix.invert().add(otherMatrix)
If invert were to return a matrix object full of NaNs with the status property 'invalid matrix', then add could still be called and would return another matrix full of NaNs.
This lets you call invert and check later whether the result was valid; if you use exceptions you have to handle them immediately, and if you want to defer the decision until later you end up building the same kind of status tracking yourself.
There is still useful information in a matrix that is partially or entirely filled with NaNs - the shape information can be used to create a new matrix to replace a bad initial vector, or known good values can still be used in calculation.
TLDR: Do the NaN thing, and recreate it in matrices.
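To make the pattern concrete, here is a minimal sketch (written in Python for brevity; a JavaScript version would be analogous, and the Matrix2 class with its invert/add methods and status values is invented purely for illustration):
import math

class Matrix2:
    # A tiny 2x2 matrix used only to illustrate the NaN-plus-status idea.
    def __init__(self, a, b, c, d, status='ok'):
        self.data = [a, b, c, d]
        self.status = status  # 'ok' or a short description of what went wrong

    def invert(self):
        a, b, c, d = self.data
        det = a * d - b * c
        if det == 0:
            # Singular: keep the shape, fill with NaN, record why it failed.
            nan = float('nan')
            return Matrix2(nan, nan, nan, nan, status='singular matrix')
        return Matrix2(d / det, -b / det, -c / det, a / det)

    def add(self, other):
        # NaN propagates by itself; we just carry the first bad status along.
        status = self.status if self.status != 'ok' else other.status
        a, b, c, d = (x + y for x, y in zip(self.data, other.data))
        return Matrix2(a, b, c, d, status)

m = Matrix2(1, 2, 2, 4).invert().add(Matrix2(1, 0, 0, 1))
print(m.status)                             # prints: singular matrix
print(any(math.isnan(x) for x in m.data))   # prints: True
The caller can chain as many operations as it likes and inspect the status (or test for NaNs) only when it finally needs the result.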

Related

How to check for missing partials

I implemented a system that is composed of a few groups and multiple components. It is relatively intricate and has component inputs/outputs for which some partials are dependent and some are not, etc.
Gradient-based optimizers seem to get stuck at the initial values and never go further than iteration 0 (not stuck at a local optimum). I have encountered this error before when I was missing declare_partials for some variables. Is there a way to automatically check which component input/output is missing partials, similar to checking for missing connections in the N^2 diagram?
There are two tools that you need to use to check for bad derivatives. The first is check_partials. That will go component by component and use either finite-difference or complex-step to verify the partial derivatives for every component (regardless of whether or not you declared them in the setup of that component). That will catch the problem if you are missing any partials, because the check-fd will see them as non-zero and will show you that there is an error.
check_partials should be your first stop, always. If you can, use complex-step to verify your derivatives; that way you know they are totally accurate. Also, check_partials will do the check around whatever point is currently initialized, so sometimes you might have a degenerate case (e.g. some input that is 0) where your check passes but your derivatives are still wrong. For example, if your component represented y = x**2 and you forgot to define derivatives, but you ran check_partials at x = 0, then the check would pass (both the missing partial and the approximated derivative are zero there). But if you ran it at x = 1, then the check would show an error.
If all of your partial derivatives are correct but you're still getting bad results, then you can try check_totals. Depending on the structure of your model, and if you have any coupling in it (i.e. you need to use some kind of nonlinear solver), it's possible that you don't have a correctly configured linear solver set up to solve for total derivatives correctly. In a lot of cases, if you have coupling, you can just put a DirectSolver at the same level as the nonlinear solver you put in the model.
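As a rough sketch of that workflow (the model construction is elided, and the 'obj'/'x' names below are placeholders for whatever your objective and design variables are actually called):
import openmdao.api as om

prob = om.Problem()
# ... add your groups and components to prob.model here ...
prob.setup(force_alloc_complex=True)   # required if you want method='cs'
prob.run_model()

# First stop: verify every component's partials against complex-step (or FD).
prob.check_partials(method='cs', compact_print=True)

# If the partials all check out but the optimizer still stalls at iteration 0,
# compare the total derivatives of the objective w.r.t. the design variables.
prob.check_totals(of=['obj'], wrt=['x'], compact_print=True)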

Is recursion a feature in and of itself?

...or is it just a practice?
I'm asking this because of an argument with my professor: I lost credit for calling a function recursively on the basis that we did not cover recursion in class, and my argument is that we learned it implicitly by learning return and methods.
I'm asking here because I suspect someone has a definitive answer.
For example, what is the difference between the following two methods:
public static int a() {
    return a();
}

public static int b() {
    return a();
}
Other than "a continues forever" (in the actual program it is used correctly to prompt a user again when provided with invalid input), is there any fundamental difference between a and b? To an un-optimized compiler, how are they handled differently?
Ultimately it comes down to whether by learning to return a() from b that we therefor also learned to return a() from a. Did we?
To answer your specific question: No, from the standpoint of learning a language, recursion isn't a feature. If your professor really docked you marks for using a "feature" he hadn't taught yet, that was wrong.
Reading between the lines, one possibility is that by using recursion, you avoided ever using a feature that was supposed to be a learning outcome for his course. For example, maybe you didn't use iteration at all, or maybe you only used for loops instead of using both for and while. It's common that an assignment aims to test your ability to do certain things, and if you avoid doing them, your professor simply can't grant you the marks set aside for that feature. However, if that really was the cause of your lost marks, the professor should take this as a learning experience of his or her own: if demonstrating certain learning outcomes is one of the criteria for an assignment, that should be clearly explained to the students.
Having said that, I agree with most of the other comments and answers that iteration is a better choice than recursion here. There are a couple of reasons, and while other people have touched on them to some extent, I'm not sure they've fully explained the thought behind them.
Stack Overflows
The more obvious one is that you risk getting a stack overflow error. Realistically, the method you wrote is very unlikely to actually lead to one, since a user would have to give incorrect input many many times to actually trigger a stack overflow.
However, one thing to keep in mind is that not just the method itself, but other methods higher or lower in the call chain will be on the stack. Because of this, casually gobbling up available stack space is a pretty impolite thing for any method to do. Nobody wants to have to constantly worry about free stack space whenever they write code because of the risk that other code might have needlessly used a lot of it up.
This is part of a more general principle in software design called abstraction. Essentially, when you call DoThing(), all you should need to care about is that Thing is done. You shouldn't have to worry about the implementation details of how it's done. But greedy use of the stack breaks this principle, because every bit of code has to worry about how much stack it can safely assume it has left to it by code elsewhere in the call chain.
Readability
The other reason is readability. The ideal that code should aspire to is to be a human-readable document, where each line describes simply what it's doing. Take these two approaches:
private int getInput() {
    int input;
    do {
        input = promptForInput();
    } while (!inputIsValid(input));
    return input;
}
versus
private int getInput() {
    int input = promptForInput();
    if (inputIsValid(input)) {
        return input;
    }
    return getInput();
}
Yes, these both work, and yes they're both pretty easy to understand. But how might the two approaches be described in English? I think it'd be something like:
I will prompt for input until the input is valid, and then return it
versus
I will prompt for input, then if the input is valid I will return it, otherwise I get the input and return the result of that instead
Perhaps you can think of slightly less clunky wording for the latter, but I think you'll always find that the first one is going to be a more accurate description, conceptually, of what you are actually trying to do. This isn't to say recursion is always less readable. For situations where it shines, like tree traversal, you could do the same kind of side by side analysis between recursion and another approach and you'd almost certainly find recursion gives code which is more clearly self-describing, line by line.
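For what it's worth, here is a hedged sketch of that tree-traversal case (in Python; the Node class and the sample tree are invented for the example), where the recursive version reads almost exactly like the English description "the total of a tree is its value plus the totals of its subtrees":
class Node:
    def __init__(self, value, children=()):
        self.value = value
        self.children = children

def total(node):
    # The total of a tree is its value plus the total of each subtree.
    return node.value + sum(total(child) for child in node.children)

tree = Node(1, (Node(2), Node(3, (Node(4),))))
print(total(tree))  # prints: 10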
In isolation, both of these are small points. It's very unlikely this would ever really lead to a stack overflow, and the gain in readability is minor. But any program is going to be a collection of many of these small decisions, so even if in isolation they don't matter much, it's important to learn the principles behind getting them right.
To answer the literal question, rather than the meta-question: recursion is a feature, in the sense that not all compilers and/or languages necessarily permit it. In practice, it is expected of all (ordinary) modern compilers - and certainly all Java compilers! - but it is not universally true.
As a contrived example of why recursion might not be supported, consider a compiler that stores the return address for a function in a static location; this might be the case, for example, for a compiler for a microprocessor that does not have a stack.
For such a compiler, when you call a function like this
a();
it is implemented as
move the address of label 1 to variable return_from_a
jump to label function_a
label 1
and the definition of a(),
function a()
{
var1 = 5;
return;
}
is implemented as
label function_a
move 5 to variable var1
jump to the address stored in variable return_from_a
Hopefully the problem when you try to call a() recursively in such a compiler is obvious; the compiler no longer knows how to return from the outer call, because the return address has been overwritten.
For the compiler I actually used (late 70s or early 80s, I think) with no support for recursion the problem was slightly more subtle than that: the return address would be stored on the stack, just like in modern compilers, but local variables weren't. (Theoretically this should mean that recursion was possible for functions with no non-static local variables, but I don't remember whether the compiler explicitly supported that or not. It may have needed implicit local variables for some reason.)
Looking forwards, I can imagine specialized scenarios - heavily parallel systems, perhaps - where not having to provide a stack for every thread could be advantageous, and where therefore recursion is only permitted if the compiler can refactor it into a loop. (Of course the primitive compilers I discuss above were not capable of complicated tasks like refactoring code.)
The teacher wants to know whether you have studied or not. Apparently you didn't solve the problem the way he taught you (the good way: iteration), and thus he considers that you didn't. I'm all for creative solutions, but in this case I have to agree with your teacher for a different reason: if the user provides invalid input too many times (e.g. by keeping Enter pressed), you'll get a stack overflow exception and your solution will crash. In addition, the iterative solution is more efficient and easier to maintain. I think that's the reason your teacher should have given you.
Deducting points because "we didn't cover recursion in class" is awful. If you learnt how to call function A which calls function B which calls function C which returns back to B which returns back to A which returns back to the caller, and the teacher didn't tell you explicitly that these must be different functions (which would be the case in old FORTRAN versions, for example), there is no reason that A, B and C cannot all be the same function.
On the other hand, we'd have to see the actual code to decide whether in your particular case using recursion is really the right thing to do. There are not many details, but it does sound wrong.
There are many points of view from which to look at the specific question you asked, but what I can say is that from the standpoint of learning a language, recursion isn't a feature on its own. If your professor really docked you marks for using a "feature" he hadn't taught yet, that was wrong; but as I said, there are other points of view to consider here which could actually make the professor right in deducting points.
From what I can deduce from your question, using a recursive function to ask for input again on input failure is not good practice, since every recursive call pushes a new frame onto the stack. Since this recursion is driven by user input, it is possible to recurse indefinitely, resulting in a StackOverflow.
There is no difference between the two examples you mentioned in your question in the sense of what they do (though they do differ in other ways). In both cases, a return address and all the method info is loaded onto the stack. In the recursive case, the return address is simply the line right after the method call (of course it's not exactly what you see in the code itself, but rather in the code the compiler created). In Java, C, and Python, recursion is fairly expensive compared to iteration (in general) because it requires the allocation of a new stack frame. Not to mention you can get a stack overflow exception if the input is invalid too many times.
I believe the professor deducted points because recursion is considered a subject of its own and it's unlikely that someone with no programming experience would think of recursion. (Of course it doesn't mean they won't, but it's unlikely.)
IMHO, I think the professor was right to deduct the points. You could easily have moved the validation into a separate method and used it like this:
public bool foo()
{
    validInput = GetInput();
    while (!validInput)
    {
        MessageBox.Show("Wrong Input, please try again!");
        validInput = GetInput();
    }
    return hasWon(x, y, piece);
}
If what you did can indeed be solved in that manner then what you did was a bad practice and should be avoided.
Maybe your professor hasn't taught it yet, but it sounds like you're ready to learn the advantages and disadvantages of recursion.
The main advantage of recursion is that recursive algorithms are often much easier and quicker to write.
The main disadvantage of recursion is that recursive algorithms can cause stack overflows, since each level of recursion requires an additional stack frame to be added to the stack.
For production code, where scaling can result in many more levels of recursion in production than in the programmer's unit tests, the disadvantage usually outweighs the advantage, and recursive code is often avoided when practical.
Regarding the specific question, is recursion a feature, I'm inclined to say yes, but after re-interpreting the question. There are common design choices of languages and compilers that make recursion possible, and Turing-complete languages do exist that don't allow recursion at all. In other words, recursion is an ability that is enabled by certain choices in language/compiler design.
Supporting first-class functions makes recursion possible under very minimal assumptions; see writing loops in Unlambda for an example, or this obtuse Python expression containing no self-references, loops or assignments:
>>> map((lambda x: lambda f: x(lambda g: f(lambda v: g(g)(v))))(
... lambda c: c(c))(lambda R: lambda n: 1 if n < 2 else n * R(n - 1)),
... xrange(10))
[1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880]
Languages/compilers that use late binding, or that define forward declarations, make recursion possible. For example, while Python allows the below code, that's a design choice (late binding), not a requirement for a Turing-complete system. Mutually recursive functions often depend on support for forward declarations.
factorial = lambda n: 1 if n < 2 else n * factorial(n-1)
Statically typed languages that allow recursively defined types contribute to enabling recursion. See this implementation of the Y Combinator in Go. Without recursively-defined types, it would still be possible to use recursion in Go, but I believe the Y combinator specifically would be impossible.
From what I can deduce from your question, using a recursive function to ask for input again on input failure is not a good practice. Why?
Because every recursive call gets pushed onto the stack. Since this recursion is driven by user input, it is possible to recurse indefinitely, resulting in a StackOverflow :-p
Having a non-recursive loop to do this is the way to go.
Recursion is a programming concept, a feature (like iteration), and a practice. As you can see from the link, there's a large domain of research dedicated to the subject. Perhaps we don't need to go that deep in the topic to understand these points.
Recursion as a feature
In plain terms, Java supports it implicitly, because it allows a method (which is basically a special kind of function) to have "knowledge" of itself and of the other methods in the class it belongs to. Consider a language where this is not the case: you would be able to write the body of that method a, but you wouldn't be able to include a call to a within it. The only solution would be to use iteration to obtain the same result. In such a language, you would have to make a distinction between functions that are aware of their own existence (via a specific syntax token) and those that aren't. Actually, a whole group of languages does make that distinction (see the Lisp and ML families for instance). Interestingly, Perl even allows anonymous functions (so-called lambdas) to call themselves recursively (again, with a dedicated syntax).
no recursion?
For languages which don't even support the possibility of recursion, there is often another solution, in the form of the fixed-point combinator, but it still requires the language to support functions as so-called first-class objects (i.e. objects which may be manipulated within the language itself).
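For the curious, here is a hedged Python sketch of that idea, a Z combinator written out far less obtusely than the one-liner quoted in the earlier answer (the names Z, rec and factorial are illustrative):
def Z(f):
    # Fixed-point combinator for a strict language: Z(f) behaves like f
    # applied to itself, without f ever having to name itself.
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

factorial = Z(lambda rec: lambda n: 1 if n < 2 else n * rec(n - 1))
print(factorial(5))  # prints: 120
Note that the lambda passed to Z never refers to itself; all of the "recursion" is supplied by the combinator.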
Recursion as a practice
Having that feature available in a language doesn't necessarily mean that it is idiomatic. In Java 8, lambda expressions have been included, so it might become easier to adopt a functional approach to programming. However, there are practical considerations:
the syntax is still not very recursion friendly
compilers may not be able to detect that practice and optimize it
The bottom line
Luckily (or, more accurately, for ease of use), Java does let methods be aware of themselves by default, and thus supports recursion, so this isn't really a practical problem; but it still remains a theoretical one, and I suppose that your teacher wanted to address it specifically. Besides, in light of the recent evolution of the language, it might turn into something important in the future.

if and elsif optimisation in Ada 95

I'm working on an Ada based project that I'm not terribly familiar with, and I've just seen something that at first glance seems inefficient, but of course that all depends on what the compiler might do.
if Ada.Strings.Fixed.Trim
(Source => Test_String,
Side => Ada.Strings.Both) =
String_1 then
--Do something here
elsif Ada.Strings.Fixed.Trim
(Source => Test_String,
Side => Ada.Strings.Both) =
String_2 then
--Do something else here
end if;
I feel that it would be more efficient to call the Trim function once, store the result in a String variable, and then test against the different Strings in each condition of the if statement, especially if there are many conditions to check (never mind that using a binary search might be even better). Of course, I may be wrong, so my question is: is there something about compile-time optimisation in Ada that I do not know about that might cause the Trim function to be called only once, with only the result tested in each condition of the if statement?
That would be compiler-dependent, not language-dependent. Certainly, GNAT GPL 2013 calls Trim twice both at -O2 and at -O3.
Your (and my) intuition seems to be right: do the trim once and store the result ...
Trimmed : constant String :=
Ada.Strings.Fixed.Trim (Source => Test_String, Side => Ada.Strings.Both);
... though personally I’d write
Trimmed : constant String :=
Ada.Strings.Fixed.Trim (Test_String, Side => Ada.Strings.Both);
on the grounds that in this case no one should need the named parameter association to clarify the programmer’s intention!
I don't think we need to worry about efficiency in this case, but the coding style does need to be improved. Defining a variable to hold the result of the Trim function is better practice in my opinion: the program logic is clearer and easier to maintain.
Let's say you want to change the Trim function to another one, or change the parameter passed in; you then only need to change one place. Although in this case there are only two call sites, you still have a chance of making an error, and if there are more cases to test, a case will almost certainly be missed.
While I have no disagreement whatsoever with your assessment or Simon's answer, it's worth bearing in mind what optimisation actually means...
If each call to "trim" takes (conservatively) 0.1ms of CPU time, and the rewrite takes 2 minutes, the breakeven point in time saved is over 1 million executions of that particular statement!
This of course ignores (a) on the positive side, the value of gaining experience, and (b) on the negative side, the fact that CPU time is nowadays less valuable than your time. And (c) the time we have spent discussing the optimisation too!
With respect to "compile time optimisation in Ada" , the Ada language doesn't say anything about this. All it would say here is that the program must behave as if the function were called twice, i.e. it must produce the same results. But if the compiler "knows" that the function would produce the exact same result the second time, it can generate code that calls it only once. It can't do this for a function call in general, because a function could include side effects or use global variables that whose values could have changed. In this case, since the effects of Ada.Strings.Fixed.Trim are defined by the language and since those effects do guarantee that the effects would be the same, a compiler could, in theory, mark this function as one for which a call with the exact same parameters could be optimized. (A function in a Pure package would definitely be one where a second call could be eliminated, but unfortunately Ada.Strings.Fixed isn't defined by the Ada language as being Pure.) To find out whether a compiler actually does this particular optimization, though, you'd have to try it and check the code. I believe that if Test_String is not marked as Volatile, then compilers are allowed to assume that its value will be the same for each Ada.Strings.Fixed.Trim call (i.e. it can't be changed by another task in between the calls), but I'm not 100% sure about this.
I'd declare a constant to hold the result (like Simon), but for me this is more out of a desire to avoid duplicated code than it is out of a concern for efficiency.

How does functional programming avoid state when it seems unavoidable?

Let's say we define a function sum(a, b), functional-programming style, that returns the sum of its arguments. So far so good; all the nice things of FP without any problems.
Now let's say we run this in an environment with dynamic typing and a singleton, stateful error stream. Then let's say we pass a value of a and/or b that sum isn't designed to handle (i.e. not numbers), and it needs to indicate an error somehow.
But how? This function is supposed to be pure and side-effect-less. How does it insert an error into the global error stream without violating that?
No programming language that I know of has anything like a "singleton stateful error stream" built in, so you'd have to make one. And you simply wouldn't make such a thing if you were trying to write your program in a pure functional style.
You could, however, have a sum function that returns either the sum or an indication of an error. The type used to do this is in fact often known by the name Either. Then you could easily make a function that invokes a whole bunch of computations that could possibly return an error, and returns a list of all the errors that were encountered in the other computations. That's pretty close to what you were talking about; it's just explicitly returned rather than being global.
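Here is a minimal sketch of that idea in Python (the tags and helper names are invented; in Haskell you would use the actual Either type):
def sum_checked(a, b):
    # Return ('ok', value) or ('error', message); nothing global is touched.
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        return ('error', f'cannot add {a!r} and {b!r}')
    return ('ok', a + b)

def collect_errors(results):
    # Gather the errors from a batch of such computations, purely.
    return [msg for tag, msg in results if tag == 'error']

results = [sum_checked(1, 2), sum_checked('x', 3)]
print(collect_errors(results))  # prints: ["cannot add 'x' and 3"]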
Remember, the question when you're writing a functional program is "how do I make a program that has the behavior I want?", not "how would I duplicate one particular approach taken in another programming style?". A "global stateful error stream" is a means, not an end. You can't have a global stateful error stream in pure functional style, no. But ask yourself what you're using the global stateful error stream to achieve; whatever it is, you can achieve that in functional programming, just not with the same mechanism.
Asking whether pure functional programming can implement a particular technique that depends on side effects is like asking how you use techniques from assembly in object-oriented programming. OO provides different tools for you to use to solve problems; limiting yourself to using those tools to emulate a different toolset is not going to be an effective way to work with them.
In response to comments: If what you want to achieve with your error stream is logging error messages to a terminal, then yes, at some level the code is going to have to do IO to do that.[1]
Printing to terminal is just like any other IO, there's nothing particularly special about it that makes it worthy of singling out as a case where state seems especially unavoidable. So if this turns your question into "How do pure functional programs handle IO?", then there are no doubt many duplicate questions on SO, not to mention many many blog posts and tutorials speaking precisely to that issue. It's not like it's a sudden surprise to implementors and users of pure programming languages, the question has been around for decades, and there have been some quite sophisticated thought put into the answers.
There are different approaches taken in different languages (IO monad in Haskell, unique modes in Mercury, lazy streams of requests and responses in historical versions of Haskell, and more). The basic idea is to come up with a model which can be manipulated by pure code, and hook up manipulations of the model to actual impure operations within the language implementation. This allows you to keep the benefits of purity (the proofs that apply to pure code but not to general impure code will still apply to code using the pure IO model).
The pure model has to be carefully designed so that you can't actually do anything with it that doesn't make sense in terms of actual IO. For example, Mercury does IO by having you write programs as if you're passing around the current state of the universe as an extra parameter. This pure model accurately represents the behaviour of operations that depend on and affect the universe outside the program, but only when there is exactly one state of the universe in the system at any one time, which is threaded through the entire program from start to finish. So some restrictions are put in place:
The type io is made abstract so that there's no way to construct a value of that type; the only way you can get one is to be passed one from your caller. An io value is passed into the main predicate by the language implementation to kick the whole thing off.
The mode of the io value passed in to main is declared such that it is unique. This means you can't do things that might cause it to be duplicated, such as putting it in a container or passing the same io value to multiple different invocations. The unique mode ensures that you can only pass the io value to a predicate that also uses the unique mode, and as soon as you pass it once the value is "dead" and can't be passed anywhere else.
[1] Note that even in imperative programs, you gain a lot of flexibility if you have your error logging system return a stream of error messages and then only actually make the decision to print them close to the outermost layer of the program. If your log calls write their output immediately, here are just a few things I can think of off the top of my head that become much harder to do with such a system:
Speculatively execute a computation and see whether it failed by checking whether it emitted any errors
Combine multiple high level systems into a single system, adding tags to the logs to distinguish each system
Emit debug and info log messages only if there is also an error message (so the output is clean when there are no errors to debug, and rich in detail when there are)
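As a hedged sketch of that footnote (all names are invented), each computation returns its result together with any messages, and only the outermost layer decides whether to print:
def step(x):
    logs = []
    if x < 0:
        logs.append(f'error: negative input {x}')
    return x * 2, logs

def pipeline(xs):
    results, all_logs = [], []
    for x in xs:
        r, logs = step(x)
        results.append(r)
        all_logs.extend(logs)
    return results, all_logs

results, logs = pipeline([1, -2, 3])
if logs:  # the decision to print is deferred to the edge of the program
    print('\n'.join(logs))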

Should I use an expression parser in my Math game?

I'm writing some children's Math Education software for a class.
I'm going to try and present problems to students of varying skill level with randomly generated math problems of different types in fun ways.
One of the frustrations of using computer-based math software is its rigidity. If you have ever taken an online Math class, you'll know all about the frustration of taking an online quiz and having your correct answer thrown out because it isn't formatted exactly the way the system expects, or because of some weird spacing issue.
So, originally I thought, "I know! I'll use an expression parser on the answer box so I'll be able to evaluate anything they enter and even if it isn't in the same form I'll be able to check if it is the same answer." So I fire up my IDE and start implementing the Shunting Yard Algorithm.
This would solve the problem of answers being rejected because a fraction isn't in lowest terms, among other issues.
However, it then hit me that a tricky student would simply be able to enter most of the problems into the answer box, and my expression parser would dutifully parse and evaluate them to the correct answer!
So, should I not be using an expression parser in this instance? Do I really have to generate a single form of the answer and do a string comparison?
One possible solution is to note how many steps your expression evaluator takes to evaluate the problem's original expression, and to compare this to the optimal answer. If there's too much difference, then the problem hasn't been reduced enough and you can suggest that the student keep going.
Don't be surprised if students come up with better answers than your own definition of "optimal", though! I was a TA/grader for several classes, and the brightest students routinely had answers on their problem sets that were superior to the ones provided by the professor.
For simple problems where you're looking for an exact answer, then removing whitespace and doing a string compare is reasonable.
For more advanced problems, you might do the Shunting Yard Algorithm (or similar) but perhaps parametrize it so you could turn on/off reductions to guard against the tricky student. You'll notice that "simple" answers can still use the parser, but you would disable all reductions.
For example, on a division question, you'd disable the "/" reduction.
This is a great question.
If you are writing an expression system and an evaluation/transformation/equivalence engine (isn't there one available somewhere? I am almost 100% sure that there is an open-source one somewhere), then it's more of an education/algebra problem: is the student's answer algebraically closer to the original expression or to the expected expression?
I'm not sure how to answer that, but here's an idea (not necessarily practical): perhaps your evaluation engine can count transformation steps to equivalence. If the answer takes fewer steps to reach the expected form than it did to reach the original, it might be OK. If it's too close to the original, it's not.
You could use an expression parser, but apply restrictions on the complexity of the expressions permitted in the answer.
For example, if the goal is to reduce (4/5)*(1/2) and you want to allow either (2/5) or (4/10), then you could restrict the set of allowable answers to expressions whose trees take the form (x/y) and which also evaluate to the correct number. Perhaps you would also allow "0.4", i.e. expressions of the form (x) which evaluate to the correct number.
This is exactly what you would (implicitly) be doing if you graded the problem manually -- you would be looking for an answer that is correct but which also falls into an acceptable class.
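As a hedged illustration of that kind of restriction (using Python's ast module; the accepted forms, function name and tolerance are all invented for the example):
import ast

def is_acceptable(answer: str, target: float, tol: float = 1e-9) -> bool:
    tree = ast.parse(answer, mode='eval').body
    # Accept a bare numeric literal ...
    if isinstance(tree, ast.Constant) and isinstance(tree.value, (int, float)):
        return abs(tree.value - target) < tol
    # ... or exactly one division of two numeric literals (x/y).
    if (isinstance(tree, ast.BinOp) and isinstance(tree.op, ast.Div)
            and isinstance(tree.left, ast.Constant)
            and isinstance(tree.right, ast.Constant)):
        return (tree.right.value != 0
                and abs(tree.left.value / tree.right.value - target) < tol)
    return False

print(is_acceptable('2/5', 0.4))           # prints: True
print(is_acceptable('0.4', 0.4))           # prints: True
print(is_acceptable('(4/5)*(1/2)', 0.4))   # prints: False (restates the problem)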
The usual way of doing this in mathematics assessment software is to allow the question setter to specify expressions/strings that are not allowed in a correct answer.
If you happen to be interested in existing software, there's the open-source Stack http://www.stack.bham.ac.uk/ (or various commercial options such as MapleTA). I suspect most of the problems that you'll come across have also been encountered by Stack so even if you don't want to use it, it might be educational to look at how it approaches things.

Resources