Should I use an expression parser in my Math game? - math

I'm writing some children's Math Education software for a class.
I'm going to try and present problems to students of varying skill level with randomly generated math problems of different types in fun ways.
One of the frustrations of computer-based math software is its rigidity. Anyone who has taken an online Math class knows the frustration of taking an online quiz and having a correct answer thrown out because it isn't formatted exactly the way the software expects, or because of some weird spacing issue.
So, originally I thought, "I know! I'll use an expression parser on the answer box so I'll be able to evaluate anything they enter and even if it isn't in the same form I'll be able to check if it is the same answer." So I fire up my IDE and start implementing the Shunting Yard Algorithm.
This would solve the problem of it not taking fractions in the smallest form and other issues.
However, it then hit me that a tricky student would simply be able to enter most of the problems into the answer box and my expression parser would dutifully parse and evaluate it to the correct answer!
So, should I not be using an expression parser in this instance? Do I really have to generate a single form of the answer and do a string comparison?

One possible solution is to note how many steps your expression evaluator takes to evaluate the problem's original expression, and to compare this to the optimal answer. If there's too much difference, then the problem hasn't been reduced enough and you can suggest that the student keep going.
Don't be surprised if students come up with better answers than your own definition of "optimal", though! I was a TA/grader for several classes, and the brightest students routinely had answers on their problem sets that were superior to the ones provided by the professor.
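As a rough sketch of the step-counting idea, the snippet below counts arithmetic operations in the student's answer and in the reference answer and flags answers that need noticeably more of them. It leans on Python's ast module instead of a hand-written evaluator, and operation_count, needs_more_reduction and the tolerance parameter are hypothetical names made up for illustration.

```python
import ast

def operation_count(expression):
    """Count the arithmetic operations (binary and unary) in an expression."""
    tree = ast.parse(expression, mode="eval")
    return sum(isinstance(node, (ast.BinOp, ast.UnaryOp)) for node in ast.walk(tree))

def needs_more_reduction(answer, optimal, tolerance=0):
    """True if the answer takes more than `tolerance` extra operations
    to evaluate than the reference ("optimal") answer does."""
    return operation_count(answer) - operation_count(optimal) > tolerance

# The unreduced problem needs three operations; "2/5" and "4/10" need one each.
print(needs_more_reduction("(4/5)*(1/2)", "2/5"))
print(needs_more_reduction("4/10", "2/5"))
```

Raising tolerance leaves room for students whose answers are shaped differently from the reference one.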

For simple problems where you're looking for an exact answer, removing whitespace and doing a string compare is reasonable.
For more advanced problems, you might do the Shunting Yard Algorithm (or similar) but perhaps parametrize it so you could turn on/off reductions to guard against the tricky student. You'll notice that "simple" answers can still use the parser, but you would disable all reductions.
For example, on a division question, you'd disable the "/" reduction.
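One way to read this is sketched below: a small shunting-yard evaluator that takes the set of operators it is allowed to reduce and rejects an answer that uses any other. This is only an illustration under simplifying assumptions: it handles non-negative integers, the four binary operators and parentheses, it does little validation of malformed input, and evaluate, allowed_ops and DisallowedOperator are hypothetical names.

```python
import operator
import re
from fractions import Fraction

# Operator table: symbol -> (precedence, function)
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}

class DisallowedOperator(ValueError):
    """Raised when the answer uses an operator that has been switched off."""

def evaluate(text, allowed_ops="+-*/"):
    """Evaluate `text` with the shunting-yard algorithm, rejecting disabled operators."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    if "".join(tokens) != text.replace(" ", ""):
        raise ValueError("unexpected character in answer")
    values, stack = [], []

    def apply_top():
        # Pop one operator and its two operands, push the result.
        op = stack.pop()
        right, left = values.pop(), values.pop()
        values.append(OPS[op][1](left, right))

    for tok in tokens:
        if tok.isdigit():
            values.append(Fraction(tok))
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                apply_top()
            stack.pop()  # discard the matching "("
        else:
            if tok not in allowed_ops:
                raise DisallowedOperator(tok)
            # Left-associative: reduce operators of equal or higher precedence first.
            while stack and stack[-1] != "(" and OPS[stack[-1]][0] >= OPS[tok][0]:
                apply_top()
            stack.append(tok)
    while stack:
        apply_top()
    if len(values) != 1:
        raise ValueError("malformed expression")
    return values[0]
```

On a division question you might call evaluate(answer, allowed_ops="+-*"), so that "3" is accepted while "12/4" raises DisallowedOperator.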

This is a great question.
If you are writing an expression system and an evaluation/transformation/equivalence engine (isn't there one available somewhere? I am almost 100% sure that there is an open source one somewhere), then it's more of an education/algebra problem: is the student's answer algebraically closer to the original expression or to the expected expression.
I'm not sure how to answer that, but just an idea (not necessarily practical): perhaps your evaluation engine can count transformation steps to equivalence. If the answer takes fewer steps to reach the expected expression than it does to reach the original, it might be ok. If it's too close to the original, it's not.

You could use an expression parser, but apply restrictions on the complexity of the expressions permitted in the answer.
For example, if the goal is to reduce (4/5)*(1/2) and you want to allow either (2/5) or (4/10), then you could restrict the set of allowable answers to expressions whose trees take the form (x/y) and which also evaluate to the correct number. Perhaps you would also allow "0.4", i.e. expressions of the form (x) which evaluate to the correct number.
This is exactly what you would (implicitly) be doing if you graded the problem manually -- you would be looking for an answer that is correct but which also falls into an acceptable class.
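A minimal sketch of that restriction follows, assuming the answer is written in a syntax Python's ast module can parse. It accepts only a bare number or a single division of two numeric literals, and only if the value matches. check_answer and literal_value are hypothetical helper names, and negative numbers are deliberately left out to keep it short.

```python
import ast
from fractions import Fraction

def literal_value(node):
    """Return the exact value of a plain numeric literal, or None for anything else."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return Fraction(str(node.value))
    return None

def check_answer(answer_text, expected):
    """Accept only answers of the form (x) or (x/y) that evaluate to `expected`."""
    try:
        tree = ast.parse(answer_text.strip(), mode="eval").body
    except (SyntaxError, ValueError):
        return False
    value = literal_value(tree)  # form (x), e.g. "0.4"
    if value is None and isinstance(tree, ast.BinOp) and isinstance(tree.op, ast.Div):
        # form (x/y): both sides must themselves be plain literals
        num, den = literal_value(tree.left), literal_value(tree.right)
        if num is None or den is None or den == 0:
            return False
        value = num / den
    return value is not None and value == expected

expected = Fraction(2, 5)
for attempt in ["(2/5)", "4/10", "0.4", "(4/5)*(1/2)"]:
    print(attempt, check_answer(attempt, expected))
```

The original problem text is rejected because its tree is a product of two quotients, not a single (x/y) node.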

The usual way of doing this in mathematics assessment software is to allow the question setter to specify expressions/strings that are not allowed in a correct answer.
If you happen to be interested in existing software, there's the open-source Stack http://www.stack.bham.ac.uk/ (or various commercial options such as MapleTA). I suspect most of the problems that you'll come across have also been encountered by Stack so even if you don't want to use it, it might be educational to look at how it approaches things.
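The forbidden-strings approach can be as small as the sketch below. It is not how Stack or any other package implements it; forbidden_found is a hypothetical helper, and the list of forbidden strings would come from whoever sets the question.

```python
def forbidden_found(answer, forbidden):
    """Return the forbidden strings that occur in the answer, ignoring whitespace."""
    compact = "".join(answer.split())
    return [item for item in forbidden if item in compact]

# For "reduce (4/5)*(1/2)" the setter might forbid "*" and "+" in the answer.
print(forbidden_found("(4/5) * (1/2)", ["*", "+"]))
print(forbidden_found("2/5", ["*", "+"]))
```

An answer would then count as correct only if it evaluates to the right value and this list comes back empty.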


h2o categorical_encoding understanding when to use and why

I'm trying to understand the pros/cons and when to use the various encoding options that are available to me in h2o with the parameter 'categorical_encoding'.
It would be helpful if people could point out general rules of thumb on how to use this.
Typically I use the 'Enum' value because I like how all categorical values are grouped together when looking at feature importance. On the other hand, xgboost's default value is 'label-encoder' I believe, which breaks things up by categorical level/value.
Unfortunately, I don't really know where to begin or questions to ask around these other values available:
one hot internal
one hot explicit
sort_by_response
enum_limited
enum
label-encoder
Again, I primarily stick with enum, sometimes label-encoder, but honestly I don't know practical implications of these various options. Would love a generalized understanding of when one might be better than other from someone knowledgeable !
As requested (thanks!) this question was reposted to cross-validated. So the answer on what the pros and cons are can be found at: https://stats.stackexchange.com/questions/376203/categorical-encoding-in-h2o-what-is-the-difference-between-the-options

Is there a way to use arbitrary type of value as key in environment or named list in R?

I've been looking for a proper implementation of hash map in R, with functionalities similar to the map type in Python.
After some googling and searching the R documentation, I found that environment and named list are the ONLY options I can use (is that really so?).
But the problem with the two is that they can only take characters as key for the hashing, not even a number, let alone other types of things.
So is there a way to use arbitrary things as key? Or at least more than just characters?
Or is there a better implementation of hash map, with better functionality, that I didn't find?
Thanks in advance.
Edit:
My current problem: I need a map to store the distance relationship between data points. That is, the key of the map is a tuple (p1, p2) and the value is a number.
The reason I asked a generic question instead of a concrete one is that I'm learning R recently and I want to know how to manipulate some of the most fundamental data structures, not only what my problem refers to. So I may need to use other things as key in the future, and I want to avoid asking similar questions with only minor difference every time I run into them.
Edit 2:
I got a lot of very good advice on this topic. It seems I'm still thinking quite in the Pythonic way, rather than the should-be R way. I should really get more R-ly! I think my purpose can easily be satisfied by a matrix in R. Thanks all!
The reason people keep asking you for a specific example is that most problems for which hash tables are the appropriate technique in Python have a good solution in R that does not involve hash tables.
That said, there are certainly times when a real hash table is useful in R, and I recommend you check out the hash package for R. It uses environments as its base but lets you do a lot of R-like vector work with them. It's efficient and I've never run into a problem with it.
Just keep in mind that if you're using hash tables a lot while working with R and your code is running slowly or is buggy, you may be able to get some mileage from figuring out a more R-like way of doing it :)

Are there a set of "universal" error/exception codes?

I'm a bit of a polyglot when it comes to programming languages, and most of the languages I use have Error/Exception handling of some sort.
In most languages there's a default implementation of error ID's with their associated messages, but I've never found a list of production codes to base my own error codes off of.
Does such a thing exist?
If not would it be useful, or just noise that most programmers ignore?
The closest thing I can think of is POSIX error constants (though their numeric values are not standardized.)
Short answer - no, it doesn't exist. Every OS, platform and piece of software pretty much has its own error IDs. These are not synchronized or based on any standard set.
I would say that apart from the common errors, this would indeed just be noise, and even for the common ones, someone would need to standardize them and ensure they are used universally.

Algorithm to find if one document is included in another, when those two documents are similar

I'm looking for an algorithm that finds whether two text documents are similar, where one document is included in the other document.
I thank you in advance.
You can always use diff with diffstat. The diff documentation isn't precise about the algorithm(s) it uses, but the original authors wrote a paper about it (Google for diff paper), and you can always read the source code.
For more precise answers you will need a more precise question. Are you only interested to know whether one document is a fragment of the other document?
Or are you also interested in knowing whether one can be split up into pieces that each occur in the other document, in the same order? Or are you also interested to know how much material does not occur if you try to match up the material of both documents with a fast algorithm? diff will tell you all those things. Or do you want to know the absolute best matching? diff doesn't always give you that, you'll need something like Levenshtein distance. If one of the documents is much shorter than the other you can use fast string searching algorithms. Etc. Etc.
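As a rough illustration of the "pieces in the same order" question, here is a sketch that uses Python's standard difflib module in place of the diff command line tool. containment is a hypothetical helper; it reports what fraction of the shorter document's words were matched, in order, against the longer one. Like diff, SequenceMatcher is a heuristic and does not promise the best possible matching. For the simplest case, exact inclusion, the plain substring test short_doc in long_doc is enough.

```python
from difflib import SequenceMatcher

def containment(short_doc, long_doc):
    """Fraction of short_doc's words that were matched, in order, in long_doc."""
    a, b = short_doc.split(), long_doc.split()
    if not a:
        return 1.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(a)

print(containment("the quick brown fox", "yesterday the quick brown fox jumped"))
print(containment("alpha beta", "gamma delta"))
```

A value of 1.0 means every word of the shorter document was matched in order; values near 0 mean the two share almost nothing.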

Surjective functions

As an extension question my lecturer for my maths in computer science module asked us to find examples of when a surjective function is vital to the operation of a system, he said he can't think of any!
I've been doing some googling and have only found a single outdated paper about non surjective rounding functions creating some flaws in some cryptographic systems.
Master edit:
[btw, thank you for the accepted response.]
In reviewing my response, and those of others in this post, I realized two things.
The first one is the fact that in looking things at a higher level of abstraction, most (all?) of the [counter-]examples provided are a form of "discretization" function. In other words, they correspond to the ubiquitous requirement in computer systems of mapping [possibly infinitely] numerous entities/values to a set (possibly "infinite" too, though most often a finite one) of discrete entities/values. While not all such mappings imply or require a non bijective surjection, many do, hence the several examples found.
The other observation is that the most compelling examples seem to be tied to stochastic (random) processes, or to the underlying primitives which support them.
Both of these things are quite telling, I think, for it mirrors, if only loosely, the way the real world's complexity (read "randomness", at many levels) is exploited in various systems in human (and animals) to produce simplified/stable/discrete maps that represent elements of this complex reality: Another case where mathematics and its practical-oriented friend, Computer Science, team up to describe or to mimic fundamental realities (or... are these realities? hum... we're getting too philosophical...)
It could be a matter of understanding exactly the frame of the question:
do bijective functions count (they are indeed a special case of a surjection)
Edit: No, bijective functions are not considered.
has it got to be a "function" in the sense of a procedural calculation as opposed to say a "relation" as in databases
Edit: yes, a procedural function of sorts... "take in a value and return another value" (IMHO this distinction is very tenuous as any "map" is a function, regardless of the inner working, but let's entertain this "numeric calculation like" restriction in the spirit of this question)
define "vital"...
With all these caveats in mind, the following may apply:
Elementary mathematical functions such as ABS() or even ROUND(), FLOOR() (absolute value, rounding of a decimal/float value to the nearest int, and rounding down, respectively) etc.
In the case of ABS(), for example, a program which draws shapes on the screen using various properties of, say, symmetry would be able to count on exactly two values mapping to a given nonzero value, and on every value in a given integral range (say from 0 to 10) being an ABS() value, lest the drawings start to look funny ;-)
the Soundex function (and its many derivations)
Modulo operations, even in such trivial uses as to show the status of a process, every x items processed.
Classification processes: it is important that there be a large reduction factor (thousands of instances mapped to a handful of categories), and it is vital [in some cases] that every instance yields one and only one category (ex: in real-time decision systems).
Various "simple" mathematical functions used in pseudo-random number generators.
It is vital that they be surjective, so that all values within the namespace are reachable; indeed, a specific, often uniform, distribution is usually expected. (Note: this could be a bit of a repeat of the "modulo" example above, although it doesn't have to use modulo arithmetic proper; other math functions can do.)
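For finite sets the property these examples rely on can be checked directly. The sketch below uses a hypothetical helper is_surjective, and the specific domains and moduli are arbitrary choices for illustration.

```python
def is_surjective(func, domain, codomain):
    """True if every element of `codomain` is func(x) for some x in `domain`."""
    return set(codomain) <= {func(x) for x in domain}

# x % 4 reaches every slot 0..3, so a selector built on it can pick any slot.
print(is_surjective(lambda x: x % 4, range(100), range(4)))
# (2 * x) % 4 only ever yields 0 and 2, so slots 1 and 3 are never reached.
print(is_surjective(lambda x: (2 * x) % 4, range(100), range(4)))
# abs() maps -10..10 onto every value in 0..10.
print(is_surjective(abs, range(-10, 11), range(0, 11)))
```

The second case is the kind of failure the generator example worries about: some outputs are simply unreachable.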
Following is a bad example, now that Martin clarified that [math operations like functions that] "take in a value and return another value" is what defines "function" hence disqualifying database/table-driven "maps" and such. And also that bijections were not considered either.
One-to-One relations (or one-to-many relations for that matter) : it can be so important to maintain these that we require triggers etc. to keep up with referential integrity
A very simple scheduler implemented by the function random(0, number of processes - 1) expects this function to be surjective, otherwise some processes will never run.
In practice the scheduler has some sort of internal state that it modifies. If you want to see it as a function in the mathematical sense, it takes a state and returns a new state and a process number to run, and in this context it's no longer important that it is surjective because not all possible states have to be reachable. Not a very good example, I'm afraid, but the only one I can think of.
A hashing function should ideally be surjective.
But in general I think the question is too vague to be answered. What is a system? What is a function used inside a system?
Edit:
I think the question is not very meaningful. After all there are many cases where you need to be able to produce every desired result. Just think about the identity function and imagine where you could argue that it is used:
using a reference to a variable in programming
using a text (or even hex-editor) to produce a file
It would be very bad, if you could not create any bit combination by xor or not when doing bit manipulations.
