How big should epsilon be when checking if dot product is close to 0?

I am working on a ray-tracing project, and I need to check whether a dot product is 0. In floating point that will almost never happen exactly, so I want to treat it as 0 if its value falls within a small interval [-eps, +eps], but I am not sure how big eps should be.
Thanks

Since you describe this as part of a ray-tracing project, the accuracy you need is likely dictated by the "world coordinates" of a scene, or perhaps even the screen coordinates to which those are translated. Either of these would give an acceptable target for the absolute error in your calculation.
It might be possible to backtrack from that to the accuracy required in the intermediate calculations you are doing, such as forming an inner product which is supposed "theoretically" to be zero. For example, you might be trying to find a shortest path (reflected light) between two smooth bodies, and the disappearance of the inner product (perpendicularity) gives the location of a point.
In a case like that the inner product may be a quadratic in the unknowns (location of a point) that you seek. It's possible that the unknowns form a "double root" (zero of multiplicity 2), making the location of that root extra sensitive to the computation of the inner product being zero.
For such cases you would want to get roughly twice the number of digits "zero" in the inner product as needed in the accuracy of the location. Essentially the inner product changes very slowly with location in the neighborhood of a double root.
But your application might not be so sensitive; an analysis of the algorithm involved is necessary to give you a good answer. As a general rule I do the inner product in double precision to get answers that may be reliable to single precision, but this may be too costly if the ray tracing is to be done in real time.
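As a minimal sketch of that rule (the function name and the fixed length of 3 are assumptions for illustration), accumulate in double even when the inputs are single-precision float:

// Accumulate the inner product in double even for float inputs, so the
// rounding error of the sum stays near double precision.
double dot3(const float a[3], const float b[3])
{
    double s = 0.0;
    for (int i = 0; i < 3; ++i)
        s += (double)a[i] * (double)b[i];
    return s;
}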

There is no definitive answer. I use two approaches.
If you are only concerned with floating-point rounding error then you can use a pretty small value. In C/C++ the relevant definitions are in float.h; note that DBL_MIN is the smallest positive normalized double (about 1e-308), far smaller than any realistic rounding error, so the machine epsilon DBL_EPSILON (the gap between 1.0 and the next representable double) is the constant to build on. I'd use a small multiple of it, e.g., 10. * DBL_EPSILON, scaled by the magnitude of the operands, as the value for eps.
If the problem is not floating point math rounding error, then I use a value that is small (say 1%) compared to the modulus of the smallest vector.
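As a sketch of how the two approaches combine in practice, assuming 3-component double vectors: treat the dot product as zero when it is small relative to the magnitudes of its inputs, so the test scales with the data. The factor 10 and the function name are illustrative choices, not prescriptions:

#include <cfloat>
#include <cmath>

// True if the dot product of a and b is "zero" relative to their sizes.
// eps is a small multiple of machine epsilon; |a||b| sets the scale.
bool nearly_perpendicular(const double a[3], const double b[3])
{
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (int i = 0; i < 3; ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    const double eps = 10.0 * DBL_EPSILON;
    return std::fabs(dot) <= eps * std::sqrt(na * nb);
}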

Related

Find the first root and local maximum/minimum of a function

Problem
I want to find
The first root
The first local minimum/maximum
of a black-box function in a given range.
The function has following properties:
It's continuous and differentiable.
It's a combination of constant and periodic functions. All periods are known.
(It's better if it can be done with weaker assumptions)
What is the fastest way to get the root and the extremum?
Do I need more assumptions or bounds of the function?
What I've tried
I know I can use root-finding algorithm. What I don't know is how to find the first root efficiently.
It needs to be fast enough to run within a few milliseconds at a precision of 1.0 over a range of 1.0e+8, which is the problem.
Since the range could be quite large and it should be precise enough, I can't brute-force it by checking all the possible subranges.
I considered the bisection method, but it's too slow at finding the first root when the function has only one root somewhere in a big range, since every subrange has to be checked.
It's preferable if the solution is in java, but any similar language is fine.
Background
I want to calculate when an arbitrary celestial object reaches a certain height.
It's a configuration-defined virtual object, so I can't assume anything about the object.
It's not easy to get either analytical solution or simple approximation because various coordinates are involved.
I decided to find a numerical solution for this.
For a general black box function, this can't really be done. Any root finding algorithm on a black box function can't guarantee that it has found all the roots or any particular root, even if the function is continuous and differentiable.
The property of being periodic gives a bit more hope, but you can still have periodic functions with infinitely many roots in a bounded domain. Given that your function relates to celestial objects, this isn't likely to happen. Assuming your periodic functions are sinusoidal, I believe you can get away with checking subranges on the order of one-quarter of the shortest period (out of all the periodic components).
Maybe try Brent's Method on the shortest quarter period subranges?
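A minimal sketch of that idea, with plain bisection standing in for Brent's method; it walks quarter-period windows from the left and refines the first one whose endpoints change sign. The names, the signature, and the use of std::optional are assumptions for illustration:

#include <functional>
#include <optional>

// Scan [a, b] in steps of quarterPeriod; bisect the first sign change.
// Returns the leftmost root found, to within tol, or nothing.
std::optional<double> firstRoot(const std::function<double(double)>& f,
                                double a, double b,
                                double quarterPeriod, double tol)
{
    double lo = a, flo = f(a);
    for (double hi = a + quarterPeriod; lo < b; hi += quarterPeriod) {
        if (hi > b) hi = b;
        double fhi = f(hi);
        if (flo == 0.0) return lo;           // endpoint is already a root
        if (flo * fhi <= 0.0) {              // sign change: root inside
            while (hi - lo > tol) {          // plain bisection
                double mid = 0.5 * (lo + hi);
                double fm = f(mid);
                if (flo * fm <= 0.0) hi = mid;
                else { lo = mid; flo = fm; }
            }
            return 0.5 * (lo + hi);
        }
        lo = hi; flo = fhi;
    }
    return std::nullopt;
}

If a window happens to straddle an even number of roots the scan steps over them, which is exactly why the window is tied to a quarter of the shortest period.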
Another approach would be to apply your root finding algorithm iteratively. If your range is (a, b), then apply your algorithm to that range to find a root at say c < b. Then apply your algorithm to the range (a, c) to find a root in that range. Continue until no more roots are found. The last root you found is a good candidate for your minimum root.
A black-box function over an arbitrary range? You cannot even be sure it has a continuous domain over that range. What kind of solutions are you looking for: natural numbers, integers, real numbers, complex? These are all questions that greatly impact the answer.
So the first thing should be determining what kind of number you accept as the result.
Second is having some kind of protection against limits of the function that will blow up your calculations as they go to plus or minus infinity.
Since we are touching on limits, your candidate could also edge towards zero and look like a solution while never actually reaching 0. This depends on your margin of error: how close something has to be before it is considered good enough.
I think your SIMPLEST TO IMPLEMENT bet for real-number solutions (I assume those) is to take an interval and apply this divide-and-conquer algorithm:
Take the lower border, the upper border, and the middle value (or an approximate middle value when the borders don't split evenly).
Try to calculate the solution at all 3 points, with some kind of protection against infinities.
Remember all 3 values in an array together with their results (3 value-result pairs).
Remember the current best value (the one closest to a solution) in a separate variable (one value-result pair).
STEP FORWARD: repeat the above on the 1st-2nd value range and on the 2nd-3rd value range.
Keep the new value-result pair that is closest to a solution.
Clear the old value-result pairs and replace them with the new ones from this iteration, while remembering the best value-result pair overall.
Repeat the above for as much precision as you want, and watch the memory explode with each iteration; keep in mind you are going to have exponential growth of candidate values. It can be further improved if you instead take one interval, go as deep as you want, remember the best value-result pair, then delete the rest and dig into the next interval.
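A rough sketch of that improved depth-first variant, assuming real-valued inputs; the names are illustrative, and non-finite results from the black-box function are skipped:

#include <cmath>

// Recursively split [lo, hi], tracking the x whose f(x) is closest to 0.
// Guards against infinities and NaN coming back from the function.
void refine(double (*f)(double), double lo, double hi, int depth,
            double& bestX, double& bestAbs)
{
    double xs[3] = { lo, 0.5 * (lo + hi), hi };
    for (double x : xs) {
        double y = f(x);
        if (std::isfinite(y) && std::fabs(y) < bestAbs) {
            bestAbs = std::fabs(y);
            bestX = x;
        }
    }
    if (depth == 0) return;
    refine(f, lo, xs[1], depth - 1, bestX, bestAbs);   // left half
    refine(f, xs[1], hi, depth - 1, bestX, bestAbs);   // right half
}

Called as, e.g., double bestX = a, bestAbs = INFINITY; refine(f, a, b, 12, bestX, bestAbs); note that the number of function evaluations still grows as 2^depth, only the memory stays small.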

Tile Based A* Pathfinding, but with a bomb

I've written a simple A* path finding algorithm to quickly find a way through a tile based dungeon in which the tiles contain the information of walls.
An example of a dungeon (only 1 path for simplicity):
However, now I'd like to add a variable number of "Bombs" to the algorithm, each of which allows the path-finding to ignore one wall. With this change it no longer finds the best paths; for example, with the use of only 1 bomb the generated path looks like the first image here:
Edit: actually it would look like this: https://i.stack.imgur.com/kPoAA.png
While the correct path would be the second image
The problem is that "Closed Nodes" now interfere with possible paths. Any ideas of how to tackle this problem would be greatly appreciated!
Your "game state" will no longer only be defined by your location, but also by an integer representing the number of bombs you have left. If you're following the pseudocode of A* on wikipedia, this means you cannot simply implement the closedSet as a grid of booleans. It should probably be implemented as, for example, a hash map / hash set, where every entry holds the following data:
x coordinate
y coordinate
number of bombs left
By visiting a certain position in the search process, you'll no longer mark just that position as closed. You'll mark the combination of position + number of bombs left as closed. That way, if later on in the same search process you run into a position where you're at the same location, but have more bombs left, you will not ignore it as closed but will actually continue searching that possibility.
Note that, if the maximum possible number of bombs is relatively low, you could also implement the closedSet as an array of boolean grids, where you first index by number of bombs, then by x and y coordinates to find out if a specific position is closed or not.
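A minimal sketch of that closed-set representation, assuming coordinates and bomb counts fit in 16 bits; the struct and the hash are illustrative, not a prescribed API:

#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Search state: position plus bombs remaining. Two visits to the same
// tile with different bomb counts are different states.
struct State {
    int x, y, bombs;
    bool operator==(const State& o) const {
        return x == o.x && y == o.y && bombs == o.bombs;
    }
};

struct StateHash {
    std::size_t operator()(const State& s) const {
        // Pack the three small fields into one 64-bit key and hash that.
        std::uint64_t k = ((std::uint64_t)(std::uint16_t)s.x << 32)
                        | ((std::uint64_t)(std::uint16_t)s.y << 16)
                        |  (std::uint64_t)(std::uint16_t)s.bombs;
        return std::hash<std::uint64_t>()(k);
    }
};

// closed.insert({x, y, bombsLeft}) marks a combination as visited;
// closed.count({x, y, bombsLeft}) tests it.
using ClosedSet = std::unordered_set<State, StateHash>;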
Doesn't this just mean that you pretend not to have any walls at all?
Use A* to find the shortest path from start to end and then check how many walls you'd have to go through. If you have enough bombs, you can use the path. Otherwise, try the next-shortest path and so on.
By the way: you might want to check http://gamedev.stackexchange.com for questions like this one.
You need to tweak the cost function so that using a bomb costs something, then run the algorithm normally with an infinite cost for a second bomb. To get the bomb used approximately halfway, play about with the cost function; it should probably cost about the heuristic A-B distance times the cost of an empty tile. If you have two bombs, halve that cost, and then of course the use of three bombs costs infinity.
But don't expect very good results. A* isn't designed for that kind of optimisation.

Maths - How to plot a course to a point with constrained final velocity?

First of all a disclaimer: I'm posting this question here even if I realize it is quite maths heavy, because I have trouble figuring out on what other site it could belong.
I'm writing a 2d spaceships game where the player will have to select the ship's destination and a course will be automatically plotted.
Along with this, I'm offering various options to control the ship's acceleration while it gets there. All these options have to do with the target velocity at the destination.
One option is to select the desired destination and velocity vector, in which case the program will use cubic interpolation, since starting and target coordinates and velocity are available.
Another option is to just select the destination point, but let the game calculate the final velocity vector. This is done through quadratic interpolation (i.e., the acceleration is constant).
I would like to introduce another option: let the player select the destination and the maximum absolute value of the velocity vector, as in
sqrt( vf_x^2 + vf_y^2 ) <= Vf_max
I take it I should use a 3rd-order polynomial to model the course in this case, but I'm having a pretty hard time figuring out the coefficients, since I'm missing one equation per coordinate (the one that would be given by the velocity at the destination). Furthermore, I'm confused as to how I should use the Vf_max constraint to recover the missing coefficients.
I suspect it could be an optimization problem, but I'm quite ignorant about this topic.
Can anybody help me find a solution or point me in the right direction, please?
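For reference, a hedged sketch of the cubic (Hermite) interpolation the first option describes, applied per coordinate: given endpoint positions and velocities over a normalized time t in [0, 1], the four coefficients fall out directly (the names are hypothetical):

// Cubic segment for one coordinate: position p0 -> p1, velocity v0 -> v1,
// over normalized time t in [0, 1]: x(t) = a t^3 + b t^2 + c t + d.
struct Cubic { double a, b, c, d; };

Cubic hermite(double p0, double v0, double p1, double v1)
{
    Cubic s;
    s.d = p0;
    s.c = v0;
    s.b = -3.0 * p0 - 2.0 * v0 + 3.0 * p1 - v1;
    s.a =  2.0 * p0 +       v0 - 2.0 * p1 + v1;
    return s;
}

The velocities here must be pre-scaled by the travel duration, since t is normalized. One way to read the missing constraint of the third option is that it leaves the direction of the final velocity free while Vf_max caps only its magnitude; once a direction is chosen, the coefficients follow as above.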

Statistical best fit for gesture detection

I have a linear regression equation from school which gives a value between -1 and 1, indicating whether a set of data points is close to a linear function, and the best-fit-line equation given here:
http://people.hofstra.edu/stefan_waner/realworld/calctopic1/regression.html
I would like to use these to do simple gesture detection based on a point in 3-space (x,y,z): forward, back, left, right, up, down. First I would check whether the points fall on a line in 2 of the 3 dimensions, then I would check whether that line's slope approaches zero or infinity.
Is this fast enough for functional gesture recognition? If not, could someone propose an alternative algorithm?
If I've understood your question correctly then (1) the calculation you describe here would probably be plenty fast enough, (2) it may not actually do what you want, and (3) the stuff that'll be slow in an actual implementation would lie elsewhere.
So, I think you're proposing to do this. (1) Identify the positions of ... something ... (the user's hand, perhaps) in three-dimensional space, at several successive times. (2) For (say) each of {x,y} and {x,z}, look at those two coordinates of each point, compute the correlation coefficient (which is what your formula describes) and see whether it's close to +-1. (3) If both correlation coefficients are close to +-1 then the points lie approximately on a straight line; calculate the gradient of that line (using a formula similar to that of the correlation coefficient). (4) If the gradients are both very close to 0 or +- infinity, then your line is approximately parallel to one axis, which is the case you're trying to recognize.
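For concreteness, a small sketch of step 2, assuming the points are already projected onto a pair of coordinates; this is the standard sample correlation coefficient, and the names are illustrative:

#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation of paired samples; near +-1 means nearly collinear.
double correlation(const std::vector<double>& x, const std::vector<double>& y)
{
    const std::size_t n = x.size();
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sx  += x[i];         sy  += y[i];
        sxx += x[i] * x[i];  syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    double cov = n * sxy - sx * sy;
    double vx  = n * sxx - sx * sx;   // zero if all x are equal
    double vy  = n * syy - sy * sy;   // zero if all y are equal
    return cov / std::sqrt(vx * vy);  // caller must guard the zero cases
}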
1: Is it fast enough? You might perhaps be sampling at 50 frames per second or thereabouts, and your gestures might take a second to execute. So you'll have somewhere on the order of 50 positions. So, the total number of arithmetic operations you'll need is maybe a few hundred (including a modest number of square roots). In the worst case, you might be doing this in emulated floating-point on a slow ARM processor or something; in that case, each arithmetic operation might take a couple of hundred cycles, so the whole thing might be 100k cycles, which for a really slow processor running at 100MHz would be about a millisecond. You're not going to have any problem with the time taken to do this calculation.
2: Is it the right thing? It's not clear that it's the right calculation. For instance, suppose your user's hand moves back and forth rapidly several times along the x-axis; that will give you a positive result; is that what you want? Suppose the user attempts the gesture you want but moves at slightly the wrong angle; you may get a negative result. Suppose they move exactly along the x-axis for a bit and then along the y-axis for a bit; then the projections onto the {x,y}, {x,z} and {y,z} planes will all pass your test. These all seem like results you might not want.
3: Is it where the real cost will lie? This all assumes you've already got (x,y,z) coordinates. Getting those is probably going to be more expensive than processing them. For instance, if you have a camera-based system of some kind then there'll be some nontrivial image processing for every frame. Or perhaps you're integrating up data from accelerometers (which, by the way, is likely to give nasty inaccurate position results); the chances are that you're doing some filtering and other calculations to get position data. I bet that the cost of performing a calculation like this one will be substantially less than the cost of getting the coordinates in the first place.

When is it appropriate to use floating precision data types?

It's clear that one shouldn't use floating precision when working with, say, monetary amounts since the variation in precision leads to inaccuracies when doing calculations with that amount.
That said, what are use cases when that is acceptable? And, what are the general principles one should have in mind when deciding?
Floating point numbers should be used for what they were designed for: computations where what you want is a fixed relative precision, and you only care that your answer is accurate to within a certain tolerance. If you need an exact answer in all cases, you're better off using something else.
Here are three domains where you might use floating point:
Scientific Simulations
Science apps require a lot of number crunching, and often use sophisticated numerical methods to solve systems of differential equations. You're typically talking double-precision floating point here.
Games
Think of games as a simulation where it's ok to cheat. If the physics is "good enough" to seem real then it's ok for games, and you can make up in user experience what you're missing in terms of accuracy. Games usually use single-precision floating point.
Stats
Like science apps, statistical methods need a lot of floating point. A lot of the numerical methods are the same; the application domain is just different. You find a lot of statistics and monte carlo simulations in financial applications and in any field where you're analyzing a lot of survey data.
Floating point isn't trivial, and for most business applications you really don't need to know all these subtleties. You're fine just knowing that you can't represent some decimal numbers exactly in floating point, and that you should be sure to use some decimal type for prices and things like that.
If you really want to get into the details and understand all the tradeoffs and pitfalls, check out the classic What Every Computer Scientist Should Know About Floating-Point Arithmetic, or pick up a book on Numerical Analysis or Applied Numerical Linear Algebra if you're really adventurous.
I'm guessing you mean "floating point" here. The answer is, basically, any time the quantities involved are approximate, measured, rather than precise; any time the quantities involved are larger than can be conveniently represented precisely on the underlying machine; any time the need for computational speed overwhelms exact precision; and any time the appropriate precision can be maintained without other complexities.
For more details of this, you really need to read a numerical analysis book.
Short story is that if you need exact calculations, DO NOT USE floating point.
Don't use floating point numbers as loop indices. Don't get caught doing:
double d;
/* 0.1 has no exact binary representation, so the running sum accumulates
   rounding error and the iteration count is not what decimal arithmetic
   would predict. */
for ( d = 0.1; d < 1.0; d += 0.1 )
{ /* Some Code... */ }
You will be surprised.
Don't use floating point numbers as keys to any sort of map because you can never count on equality behaving like you may expect.
Most real-world quantities are inexact, and typically we know their numeric properties with a lot less precision than a typical floating-point value. In almost all cases, the C types float and double are good enough.
It is necessary to know some of the pitfalls. For example, testing two floating-point numbers for equality is usually not what you want, since all it takes is a single bit of inaccuracy to make the comparison non-equal. tgamblin has provided some good references.
The usual exception is money, which is calculated exactly according to certain conventions that don't translate well to binary representations. Part of this is the constants used: you'll never see a pi% interest rate, or a 22/7% interest rate, but you might well see a 3.14% interest rate. In other words, the numbers used are typically expressed in exact decimal fractions, not all of which are exact binary fractions. Further, the rounding in calculations is governed by conventions that also don't translate well into binary. This makes it extremely difficult to precisely duplicate financial calculations with standard floating point, and therefore people use other methods for them.
It's appropriate to use floating point types when dealing with scientific or statistical calculations, where the inputs will typically only have, say, 3-8 significant digits of accuracy.
As to whether to use single or double precision floating point types, this depends on your need for accuracy and how many significant digits you need. Typically though people just end up using doubles unless they have a good reason not to.
For example if you measure distance or weight or any physical quantity like that the number you come up with isn't exact: it has a certain number of significant digits based on the accuracy of your instruments and your measurements.
For calculations involving anything like this, floating point numbers are appropriate.
Also, if you're dealing with irrational numbers, floating point types are appropriate (and really your only choice), e.g. in linear algebra, where you deal with square roots a lot.
Money is different because you typically need to be exact and every digit is significant.
I think you should ask the question the other way around: when should you not use floating point? For most numerical tasks, floating point is the preferred data type, as you can (almost) forget about overflow and the other kinds of problems typically encountered with integer types.
One way to look at the floating-point data types is that their precision is independent of the magnitude: whether the number is very small or very big (within an acceptable range, of course), the number of meaningful digits is approximately the same.
One drawback is that floating-point numbers have some surprising properties: x == x can be false (if x is NaN), and they do not follow the usual mathematical rules, e.g. distributivity, so x * (y + z) is not always equal to x * y + x * z. Depending on the values of x, y, and z, this can matter.
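A tiny demonstration of both surprises, assuming IEEE doubles:

#include <cmath>
#include <cstdio>

int main()
{
    double n = std::nan("");
    std::printf("%d\n", n == n);                         // prints 0: NaN != NaN

    double x = 100.0, y = 0.1, z = 0.2;
    // Distributivity fails under rounding: the two sides round differently.
    std::printf("%d\n", x * (y + z) == x * y + x * z);   // prints 0
    return 0;
}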
From Wikipedia:
"Floating-point arithmetic is at its best when it is simply being used to measure real-world quantities over a wide range of scales (such as the orbital period of Io or the mass of the proton), and at its worst when it is expected to model the interactions of quantities expressed as decimal strings that are expected to be exact."
Floating point is fast but inexact. If that is an acceptable trade off, use floating point.
