I am suddenly in a recursive language class (sml) and recursion is not yet physically sensible for me. I'm thinking about the way a floor of square tiles is sometimes a model or metaphor for integer multiplication, or Cuisenaire Rods are a model or analogue for addition and subtraction. Does anyone have any such models you could share?
Imagine you're a real life magician, and can make a copy of yourself. You create your double a step closer to the goal and give him (or her) the same orders as you were given.
Your double does the same to his copy. He's a magician too, you see.
When the final copy finds itself created at the goal, it has nowhere more to go, so it reports back to its creator. Which does the same.
Eventually, you get your answer back – without having moved an inch – and can now create the final result from it, easily. You get to pretend not knowing about all those doubles doing the actual hard work for you. "Hmm," you're saying to yourself, "what if I were one step closer to the goal and already knew the result? Wouldn't it be easy to find the final answer then ?" (*)
Of course, if you were a double, you'd have to report your findings to your creator.
More here.
(also, I think I saw this "doubles" creation chain event here, though I'm not entirely sure).
(*) and that is the essence of the recursion method of problem solving.
How do I know my procedure is right? If my simple little combination step produces a valid solution under the assumption that the smaller case was solved correctly, all I need is to make sure it works for the smallest case (the base case), and then by induction the validity is proven!
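As a minimal sketch of that reasoning in Python (the function name `total` is just an illustration, not something from the question), here is a recursive sum. The base case is the empty list, and the combination step is a single addition that is trivially correct if the smaller sum is correct:

```python
def total(numbers):
    """Sum a list recursively: solve the smaller case, then combine."""
    # Base case: the smallest problem, whose answer we know outright.
    if not numbers:
        return 0
    # "What if I were one step closer to the goal and already knew the result?"
    rest = total(numbers[1:])    # the double handles the smaller job
    return numbers[0] + rest     # the simple combination step


print(total([3, 1, 4, 1, 5]))  # 14
```

Each call only has to trust that the call below it did its job, which is exactly the induction argument.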
Another possibility is divide-and-conquer, where we split our problem in two halves, so will get to the base case much much faster. As long as the combination step is simple (and preserves validity of solution of course), it works. In our magician metaphor, I get to create two copies of myself, and combine their two answers into one when they are finished. Each of them creates two copies of themselves as well, so this creates a branching tree of magicians, instead of a simple line as before.
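A small Python sketch of the two-copies idea, assuming a non-empty list (the name `largest` is hypothetical): each call hands one half to each "copy" and combines their two answers with a single comparison.

```python
def largest(values):
    """Find the maximum of a non-empty list by splitting it in two halves."""
    if len(values) == 1:            # base case: one element is its own maximum
        return values[0]
    mid = len(values) // 2
    left = largest(values[:mid])    # first copy handles the left half
    right = largest(values[mid:])   # second copy handles the right half
    return left if left >= right else right  # combine the two answers


print(largest([7, 2, 9, 4, 9, 1]))  # 9
```

Because the problem halves at every level, the chain of nested calls is only about log2(n) deep, while the total number of magicians in the tree still grows with n.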
A good example is the Sierpinski triangle, a figure built from three quarter-sized Sierpinski triangles (half the side length, a quarter of the area) simply by stacking them up at their corners.
Each of the three component triangles is built according to the same recipe.
Although it doesn't have the base case, and so the recursion is unbounded (bottomless; infinite), any finite representation of S.T. will presumably draw just a dot in place of the S.T. which is too small (serving as the base case, stopping the recursion).
There's a nice picture of it in the linked Wikipedia article.
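Here is a rough text-mode sketch in Python of a finite Sierpinski triangle. The `order` parameter is an assumption of this sketch: it plays the role of the size limit, and at order 0 a single `*` stands in for the triangle that is too small to subdivide.

```python
def sierpinski(order):
    """Return the rows of a text Sierpinski triangle of the given order."""
    if order == 0:
        return ["*"]                      # base case: just a dot
    smaller = sierpinski(order - 1)       # the same recipe, one size down
    pad = " " * (2 ** (order - 1))
    top = [pad + row + pad for row in smaller]      # one copy centred on top
    bottom = [row + " " + row for row in smaller]   # two copies side by side
    return top + bottom


for row in sierpinski(3):
    print(row)
```

Without the `order == 0` check the function would keep calling itself and never return a single row, which is the "never draws anything" problem in miniature.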
Recursively drawing an S.T. without the size limit will never draw anything on screen! For mathematicians recursion may be great, engineers though should be more cautious about it. :)
Switching to corecursion ⁄ iteration (see the linked answer for that), we would first draw the outlines, and the interiors after that; so even without the size limit the picture would appear pretty quickly. The program would then be busy without any noticeable effect, but that's better than the empty screen.
I came across this piece from Edsger W. Dijkstra; he tells how his child grasped recursion:
A few years later a five-year old son would show me how smoothly the idea of recursion comes to the unspoilt mind. Walking with me in the middle of town he suddenly remarked to me, Daddy, not every boat has a lifeboat, has it? I said How come? Well, the lifeboat could have a smaller lifeboat, but then that would be without one.
I love this question and couldn't resist adding an answer...
Recursion is the Russian doll of programming. The first example that comes to my mind is closer to an example of mutual recursion:
Mutual recursion everyday example
Mutual recursion is a particular case of recursion (but sometimes it's easier to understand from a particular case than from a generic one) in which we have two functions A and B defined such that A calls B and B calls A. You can experiment with this very easily using a webcam (it also works with 2 mirrors):
Display the webcam output on your screen with VLC, or any software that can do it.
Point your webcam to the screen.
The screen will progressively display an infinite "vortex" of screens.
What happens?
The webcam (A) captures the screen (B).
The screen displays the image captured by the webcam (the screen itself).
The webcam captures the screen with a screen displayed on it.
The screen displays that image (now there are two screens displayed).
And so on.
You finally end up with such an image (yes, my webcam is total crap):
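In code, the same A-calls-B, B-calls-A pattern might look like this Python sketch. The functions `is_even` and `is_odd` are illustrative and assume a non-negative integer; unlike the webcam, they have a base case at zero, so the back-and-forth stops.

```python
def is_even(n):
    """A: decides evenness by asking B about the next smaller number."""
    if n == 0:
        return True          # base case: zero is even
    return is_odd(n - 1)     # A calls B


def is_odd(n):
    """B: decides oddness by asking A about the next smaller number."""
    if n == 0:
        return False         # base case: zero is not odd
    return is_even(n - 1)    # B calls A


print(is_even(10), is_odd(7))  # True True
```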
"Simple" recursion is more or less the same except that there is only one actor (function) that calls itself (A calls A)
"Simple" Recursion
That's more or less the same answer as @WillNess but with a little code and some interactivity (using the js snippets of SO)
Let's say you are a very motivated gold-miner looking for gold, with a very tiny mine, so tiny that you can only look for gold vertically. And so you dig, and you check for gold. If you find some, you don't have to dig anymore, just take the gold and go. But if you don't, that means you have to dig deeper. So there are only two things that can stop you:
Finding some gold nugget.
The Earth's boiling core of molten iron.
So if you want to write this programmatically -using recursion-, that could be something like this :
// This function only generates a probability of 1/10
function checkForGold() {
    let rnd = Math.round(Math.random() * 10);
    return rnd === 1;
}

function digUntilYouFind() {
    if (checkForGold()) {
        return 1; // he found something, no need to dig deeper
    }
    // gold not found, digging deeper
    return digUntilYouFind();
}

let gold = digUntilYouFind();
console.log(`${gold} nugget found`);
Or with a little more interactivity :
// This function only generates a probability of 1/10
function checkForGold() {
    console.log("checking...");
    let rnd = Math.round(Math.random() * 10);
    return rnd === 1;
}

function digUntilYouFind() {
    if (checkForGold()) {
        console.log("OMG, I found something !");
        return 1;
    }
    try {
        console.log("digging...");
        return digUntilYouFind();
    } finally {
        console.log("climbing back...");
    }
}

let gold = digUntilYouFind();
console.log(`${gold} nugget found`);
If we don't find some gold, the digUntilYouFind function calls itself. When the miner "climbs back" from his mine it's actually the deepest child call to the function returning the gold nugget through all its parents (the call stack) until the value can be assigned to the gold variable.
Here the probability is high enough to keep the miner from digging down to the Earth's core. The Earth's core is to the miner what the stack size is to a program. When the miner reaches the core he dies in terrible pain; when the program exceeds the stack size (causing a stack overflow), it crashes.
There are optimizations the compiler/interpreter can make, such as tail-call optimization, that allow unbounded recursion depth.
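To show what tail-call optimization effectively buys you, here is a hand-written loop version of the miner in Python. This is only a sketch: CPython does not perform tail-call optimization itself, the `rng` parameter is an assumption added to make the function testable, and this version returns the depth reached rather than the nugget count.

```python
import random


def check_for_gold(rng):
    """Roughly a 1-in-10 chance of finding gold on each check."""
    return rng.randrange(10) == 0


def dig_until_you_find(rng):
    """Loop form of the tail-recursive miner: no stack growth per level."""
    depth = 0
    while not check_for_gold(rng):   # each pass replaces one recursive call
        depth += 1                   # digging one level deeper
    return depth                     # how deep we were when gold turned up


print(dig_until_you_find(random.Random()), "levels dug")
```

Since the recursive call was the very last thing the function did, nothing has to be remembered for the climb back, so the loop can dig arbitrarily deep without ever touching the "core".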
Take fractals as being recursive: the same pattern gets applied each time, yet each figure differs from another.
As natural phenomena with fractal features, Wikipedia presents:
Mountain ranges
Frost crystals
DNA
and, even, proteins.
This is odd, and not quite a physical example except insofar as dance-movement is physical. It occurred to me the other morning. I call it "Written in Latin, solved in Hebrew." Huh? Surely you are saying "Huh?"
By it I mean that encoding a recursion is usually done left-to-right, in the Latin alphabet style: "Def fac(n) = n*(fac(n-1))." The movement style is "outermost case to base case."
But (please check me on this) at least in this simple case, it seems the easiest way to evaluate it is right-to-left, in the Hebrew alphabet style: Start from the base case and move outward to the outermost case:
(fac(0) = 1)
(fac(1) = 1)*(fac(0) = 1)
(fac(2))*(fac(1) = 1)*(fac(0) = 1)
(fac(n))*(fac(n-1))*...*(fac(2))*(fac(1) = 1)*(fac(0) = 1)
(* Easier order to calculate <<<<<<<<<<< is leftwards,
base outwards to outermost case;
more difficult order to calculate >>>>>> is rightwards,
outermost case to base *)
Then you do not have to suspend items on the left while awaiting the results of calculations further right. "Dance Leftwards" instead of "Dance rightwards"?
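The two directions can be put side by side in a short Python sketch (both function names are made up for illustration). The first follows the written definition from the outermost case down to the base; the second starts at the base case and works outward, so nothing is left waiting on the left.

```python
def fac_outer_to_base(n):
    """As written: each call is suspended until the smaller one returns."""
    if n == 0:
        return 1
    return n * fac_outer_to_base(n - 1)


def fac_base_outward(n):
    """As evaluated: start at fac(0) = 1 and multiply outward."""
    result = 1                      # fac(0)
    for k in range(1, n + 1):
        result *= k                 # fac(k) from fac(k-1), nothing suspended
    return result


print(fac_outer_to_base(5), fac_base_outward(5))  # 120 120
```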
I'm trying to get my head around these integration methods and I'm thoroughly confused.
Here is the code:
public void update_euler(float timeDelta){
    vPos.y += vVelocity.y * timeDelta;
    vVelocity.y += gravity.y * timeDelta;
}

public void update_nsv(float timeDelta){
    vVelocity.y += gravity.y * timeDelta;
    vPos.y += vVelocity.y * timeDelta;
}

public void onDrawFrame(GL10 gl) {
    currentTime = System.currentTimeMillis();
    float timeDelta = currentTime - startTime;
    startTime = currentTime;
    timeDelta *= 1.0f/1000;
    // update_RK4(timeDelta);
    // update_nsv(timeDelta);
    // update_euler(timeDelta);
    // update_velocity_verlet(timeDelta);
}
Firstly, I just want to make sure I've got these right.
I am simulating a perfectly elastic ball bouncing, so on the bounce I just reverse the velocity.
With the Euler method, the ball bounces higher on each bounce. Is this due to an error in my code, or is it due to the inaccuracy of the method? I've read that with Euler integration you lose energy over time. Well, I'm gaining it and I don't know why.
The nsv method: I don't quite understand how this is different from the Euler method, but in any case the ball bounces lower on each bounce. It is losing energy, which I've read isn't meant to happen with the nsv method. Why am I losing energy?
(The velocity verlet and RK4 methods are working as I'd expect them to).
I get the impression I'm lacking a fundamental bit of information on this subject, but I don't know what.
I do realise my timestep is lacking, and updating it to run the physics using a static timestep would stop me losing/gaining energy, but I am trying to understand what is going on.
Any help would be appreciated.
To add another option to @Beta's answer, if you average the two methods, your error should disappear (except for issues around handling the actual bounce).
public void update_avg(float timeDelta){
    vVelocity.y += gravity.y*timeDelta/2;
    vPos.y += vVelocity.y * timeDelta;
    vVelocity.y += gravity.y*timeDelta/2;
}
What I'm doing here is updating the velocity to the average velocity over the interval, then updating the position based on that velocity, then updating the velocity to the velocity at the end of the interval.
If you have a more complicated scenario that you want to model, consider using the Runge-Kutta method to solve differential equations of the form y' = f(x, y). (Note that here y can be a set of different variables, so in your case you'd have d(position, velocity)/dt = (velocity, -gravity).) The code I gave you works out to be the same as the second-order version of that method.
In real life, the ball moves upward and decelerates, reaches the apex (apogee) where its velocity is zero for a split-second, then moves downward and accelerates. Over any time interval it is exchanging kinetic energy (being fast) with potential energy (being high).
In the Euler method, it moves with constant velocity for the duration of the interval, then at the end of the interval it suddenly changes its velocity. So on the upward journey it goes up at high speed, then slows down, having gained more altitude than it should have. On the downward leg it creeps down slowly, losing little altitude, then speeds up.
In the nsv method, the opposite happens: on the way up it loses speed "too soon" and doesn't get very high, on the way down it hurries and reaches the ground without building up much speed.
The two methods are the same in the limit as timeDelta goes to zero. (If that statement made no sense, don't sweat it, it's just calculus.) If you make timeDelta small, the effect should fade. Or you could use energy as your primary variable, not {position, velocity}, but the math would be a little more complicated.
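To see the energy drift in numbers, here is a small Python sketch of a ball in free fall with no bounce. The constants (unit mass, g = 9.81, the starting height and the step size) are assumptions of the sketch, not values from the question. It steps the ball with the Euler update, the nsv update, and the averaged update, then reports how far the total energy has moved from its starting value.

```python
G = 9.81  # assumed gravitational acceleration (m/s^2), pointing down


def energy(y, v):
    """Kinetic plus potential energy for a unit mass."""
    return 0.5 * v * v + G * y


def step_euler(y, v, dt):
    return y + v * dt, v - G * dt            # position uses the OLD velocity


def step_nsv(y, v, dt):
    v_new = v - G * dt
    return y + v_new * dt, v_new             # position uses the NEW velocity


def step_avg(y, v, dt):
    v_half = v - G * dt / 2                  # velocity at mid-interval
    return y + v_half * dt, v_half - G * dt / 2


def energy_drift(step, steps=100, dt=0.01, y0=10.0, v0=0.0):
    """Energy after `steps` updates minus the starting energy."""
    y, v = y0, v0
    for _ in range(steps):
        y, v = step(y, v, dt)
    return energy(y, v) - energy(y0, v0)


for name, step in [("euler", step_euler), ("nsv", step_nsv), ("avg", step_avg)]:
    print(name, energy_drift(step))
```

For constant gravity a little algebra shows each Euler step adds G²·dt²/2 of energy and each nsv step removes the same amount, which matches the "bounces higher" and "bounces lower" symptoms, while the averaged update leaves the energy unchanged up to rounding.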
The integration introduces artificial damping into the system. I believe you can determine how much by doing a Fourier analysis on the integration scheme, but I'd have to refresh my memory on the details.
I have balls bouncing around and each time they collide their speed vector is reduced by the Coefficient of Restitution.
Right now the CoR for my balls is 0.80. So after many bounces my balls have "stopped" rolling because their speed has become some ridiculously small number.
At what stage is it appropriate to check whether a speed value is small enough to simply call it zero (so I don't have the crazy jittering of the balls reacting to their micro-velocities)? I've read on some forums before that people will sometimes use an epsilon constant, some small number, and check against that.
Should I define an epsilon constant and do something like:
if Math.abs(velocity.x) < epsilon then velocity.x = 0
Each time I update the ball's velocity and position? Is this what is generally done? Would it be reasonable to place that in my Vector class's setters for x and y? Or should I do it outside of my vector class when I'm calculating the velocities?
Also, what would be a reasonable epsilon value if I was using floats for my speed vector?
A reasonable value for epsilon is going to depend on the constraints of your system. If you are representing the ball graphically, then your epsilon might correspond to, say, a velocity of .1 pixels a second (ensuring that your notion of stopping matches the user's experience of the screen objects stopping). If you're doing a physics simulation, you'll want to tune it to the accuracy to which you're trying to measure your system.
As for how often you check - that depends as well. If you're simulating something in real time, the extra check might be costly, and you'll want to check every 10 updates or once per second or something. Or performance might not be an issue, and you can check with every update.
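A minimal Python sketch of such a check, done outside the vector class after each velocity update. The threshold `REST_EPSILON` and the function name `settle` are assumptions for illustration; note that this variant compares the overall speed rather than each component separately, so a ball does not stop along one axis while still drifting along the other.

```python
import math

REST_EPSILON = 0.1  # assumed threshold, e.g. 0.1 pixels per second


def settle(vx, vy, epsilon=REST_EPSILON):
    """Return the velocity, snapped to zero when the speed is below epsilon."""
    if math.hypot(vx, vy) < epsilon:
        return 0.0, 0.0          # treat the ball as stopped
    return vx, vy                # still visibly moving: leave it alone


print(settle(0.03, -0.04))  # (0.0, 0.0)
print(settle(3.0, 4.0))     # (3.0, 4.0)
```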
Instead of an epsilon for an IsStillMoving function, maybe you could use an UpdatePosition function, scheduled on an object-by-object basis based on its velocity.
I'd do something like this (in my own make-it-up-as-you-go pseudocode):
void UpdatePosition(Ball b) {
    TimeStamp now = Clock.GetTime();
    float secondsSinceLastUpdate = now.TimeSince(b.LastUpdate).InSeconds;
    Point3D oldPosition = b.Position;
    Point3D newPosition = CalculatePosition(b.Position, b.Velocity, secondsSinceLastUpdate);
    b.MoveTo(newPosition);
    b.LastUpdate = now;
    float epsilonOfAccuracy = 0.5; // Accurate to one half-pixel
    float pixelDistance = Camera.PixelDistance(oldPosition, newPosition);
    // Seconds the ball needs to cross one pixel at its current on-screen speed
    // (clamp the distance so a stopped ball does not divide by zero)
    float secondsToMoveOnePixel = secondsSinceLastUpdate / Max(pixelDistance, 0.0001);
    // Schedule the next update for when the ball has moved about half a pixel
    float nextUpdateInterval = secondsToMoveOnePixel * epsilonOfAccuracy;
    b.SetNextUpdateAt(now + nextUpdateInterval);
}
Balls moving very quickly would get updated on every frame. Balls moving more slowly might update every five or ten frames. And balls that have stopped (or nearly stopped) would update only very very rarely.
IMO your epsilon approach is fine. I would just experiment to see what looks or feels natural to the animation in the game.
Epsilon by nature is the smallest possible increment. Unfortunately, computers have different "minimal" increments of their own depending on the floating point representation. I would be very careful playing around with that (and might even go higher than what I would calculate, just for safety), especially if I want the code to be portable.
You may want to write a function that figures out the minimal increment on your floats rather than use a magic value.
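One way such a function could look, sketched in Python: halve a candidate until adding half of it to 1.0 no longer changes the result. Python floats are 64-bit doubles, so this reports the double-precision value; a 32-bit float in another language would give a larger number. The name `machine_epsilon` is illustrative.

```python
import sys


def machine_epsilon():
    """Smallest power of two eps such that 1.0 + eps is still above 1.0."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:   # would half of eps still be noticed?
        eps /= 2
    return eps


print(machine_epsilon() == sys.float_info.epsilon)  # True
```

Keep in mind that this is the granularity of the number format near 1.0, which is usually far smaller than a sensible "the ball has stopped" threshold.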
I'm working on a 2D game where I'm trying to accelerate an object to a top speed using some basic physics code.
Here's the pseudocode for it:
const float acceleration = 0.02f;
const float friction = 0.8f; // value is always 0.0..1.0
float velocity = 0;
float position = 0;
move()
{
    velocity += acceleration;
    velocity *= friction;
    position += velocity;
}
This is a very simplified approach that doesn't rely on mass or actual friction (the in-code friction is just a generic force acting against movement). It works well as the "velocity *= friction;" part keeps the velocity from going past a certain point. However, it's this top speed and its relationship to the acceleration and friction where I'm a bit lost.
What I'd like to do is set a top speed, and the amount of time it takes to reach it, then use them to derive the acceleration and friction values.
i.e.,
const float max_velocity = 2.0;
const int ticks = 120; // If my game runs at 60 FPS, I'd like a
// moving object to reach max_velocity in
// exactly 2 seconds.
const float acceleration = ?
const float friction = ?
I found this question very interesting since I had recently done some work on modeling projectile motion with drag.
Point 1: You are essentially updating the position and velocity using an explicit/forward Euler iteration where each new value for the states should be a function of the old values. In such a case, you should be updating the position first, then updating the velocity.
Point 2: There are more realistic physics models for the effect of drag friction. One model (suggested by Adam Liss) involves a drag force that is proportional to the velocity (known as Stokes' drag, which generally applies to low velocity situations). The one I previously suggested involves a drag force that is proportional to the square of the velocity (known as quadratic drag, which generally applies to high velocity situations). I'll address each one with regard to how you would deduce formulas for the maximum velocity and the time required to effectively reach the maximum velocity. I'll forego the complete derivations since they are rather involved.
Stokes' drag:
The equation for updating the velocity would be:
velocity += acceleration - friction*velocity
which represents the following differential equation:
dv/dt = a - f*v
Using the first entry in this integral table, we can find the solution (assuming v = 0 at t = 0):
v = (a/f) - (a/f)*exp(-f*t)
The maximum (i.e. terminal) velocity occurs when t >> 0, so that the second term in the equation is very close to zero and:
v_max = a/f
Regarding the time needed to reach the maximum velocity, note that the equation never truly reaches it, but instead asymptotes towards it. However, when the argument of the exponential equals -5, the velocity is over 99% of the maximum velocity (exp(-5) is about 0.007), probably close enough to consider it equal. You can then approximate the time to maximum velocity as:
t_max = 5/f
You can then use these two equations to solve for f and a given a desired v_max and t_max.
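Solving v_max = a/f and t_max = 5/f for the two unknowns gives f = 5/t_max and a = v_max*f. The Python sketch below (function names are hypothetical) plugs in the question's targets of a top speed of 2.0 after 120 ticks and runs the per-tick update to see where the velocity lands.

```python
def stokes_parameters(v_max, t_max):
    """Derive (acceleration, friction) from v_max = a/f and t_max = 5/f."""
    friction = 5.0 / t_max
    acceleration = v_max * friction
    return acceleration, friction


def stokes_velocity_after(ticks, acceleration, friction):
    """Run the per-tick update: velocity += acceleration - friction*velocity."""
    velocity = 0.0
    for _ in range(ticks):
        velocity += acceleration - friction * velocity
    return velocity


a, f = stokes_parameters(v_max=2.0, t_max=120)
print(a, f, stokes_velocity_after(120, a, f))
```

The velocity creeps up towards 2.0 from below without ever exceeding it, so "reached" here means "within a fraction of a percent", as the asymptote argument above implies.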
Quadratic drag:
The equation for updating the velocity would be:
velocity += acceleration - friction*velocity*velocity
which represents the following differential equation:
dv/dt = a - f*v^2
Using the first entry in this integral table, we can find the solution (assuming v = 0 at t = 0):
v = sqrt(a/f)*(exp(2*sqrt(a*f)*t) - 1)/(exp(2*sqrt(a*f)*t) + 1)
The maximum (i.e. terminal) velocity occurs when t >> 0, so that the exponential terms are much greater than 1 and the equation approaches:
v_max = sqrt(a/f)
Regarding the time needed to reach the maximum velocity, note that the equation never truly reaches it, but instead asymptotes towards it. However, when the argument of the exponential equals 5, the velocity is around 99% of the maximum velocity, probably close enough to consider it equal. You can then approximate the time to maximum velocity as:
t_max = 2.5/sqrt(a*f)
which is also equivalent to:
t_max = 2.5/(f*v_max)
For a desired v_max and t_max, the second equation for t_max will tell you what f should be, and then you can plug that in to the equation for v_max to get the value for a.
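Worked through for quadratic drag, that recipe is f = 2.5/(t_max*v_max) followed by a = f*v_max^2. A Python sketch with the question's targets (function names are made up for illustration):

```python
def quadratic_parameters(v_max, t_max):
    """Derive (acceleration, friction) from t_max = 2.5/(f*v_max), v_max = sqrt(a/f)."""
    friction = 2.5 / (t_max * v_max)
    acceleration = friction * v_max ** 2
    return acceleration, friction


def quadratic_velocity_after(ticks, acceleration, friction):
    """Run the per-tick update: velocity += acceleration - friction*velocity^2."""
    velocity = 0.0
    for _ in range(ticks):
        velocity += acceleration - friction * velocity * velocity
    return velocity


a, f = quadratic_parameters(v_max=2.0, t_max=120)
print(a, f, quadratic_velocity_after(120, a, f))
```

With these values the per-tick iteration ends up close to, but still below, the chosen top speed after 120 ticks.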
This seems like a bit of overkill, but these are actually some of the simplest ways to model drag! Anyone who really wants to see the integration steps can shoot me an email and I'll send them to you. They are a bit too involved to type here.
Another Point: I didn't immediately realize this, but the updating of the velocity is not necessary anymore if you instead use the formulas I derived for v(t). If you are simply modeling acceleration from rest, and you are keeping track of the time since the acceleration began, the code would look something like:
position += velocity_function(timeSinceStart)
where "velocity_function" is one of the two formulas for v(t) and you would no longer need a velocity variable. In general, there is a trade-off here: calculating v(t) may be more computationally expensive than simply updating velocity with an iterative scheme (due to the exponential terms), but it is guaranteed to remain stable and bounded. Under certain conditions (like trying to get a very short tmax), the iteration can become unstable and blow-up, a common problem with the forward Euler method. However, maintaining limits on the variables (like 0 < f < 1), should prevent these instabilities.
In addition, if you're feeling somewhat masochistic, you may be able to integrate the formula for v(t) to get a closed-form solution for p(t), thus forgoing the need for an Euler iteration altogether. I'll leave this for others to attempt. =)
Warning: Partial Solution
If we follow the physics as stated, there is no maximum velocity. From a purely physical viewpoint, you've fixed the acceleration at a constant value, which means the velocity is always increasing.
As an alternative, consider the two forces acting on your object:
The constant external force, F, that tends to accelerate it, and
The force of drag, d, which is proportional to the velocity and tends to slow it down.
So the velocity at iteration n becomes: v_n = v_0 + n*F - d*v_(n-1)
You've asked to choose the maximum velocity, v_nmax, that occurs at iteration n_max.
Note that the problem is under-constrained; that is, F and d are related, so you can arbitrarily choose a value for one of them, then calculate the other.
Now that the ball's rolling, is anyone willing to pick up the math?
Warning: it's ugly and involves power series!
velocity *= friction;
This doesn't prevent the velocity from going above a certain point...
Friction increases exponentially (don't quote me on that) as the velocity increases, and will be 0 at rest. Eventually, you will reach a point where friction = acceleration.
So you want something like this:
velocity += (acceleration - friction);
position += velocity;
friction = a*exp(b*velocity);
Where you pick values for a and b. b will control how long it takes to reach top speed, and a will control how abruptly the friction increases. (Again, do your own research on this; I'm going from what I remember from grade 12 physics.)
This isn't answering your question, but one thing you shouldn't do in simulations like this is depend on a fixed frame rate. Calculate the time since the last update, and use the delta-T in your equations. Something like:
static double lastUpdate = 0;
if (lastUpdate != 0) {
    double deltaT = time() - lastUpdate;
    velocity += acceleration * deltaT;
    position += velocity * deltaT;
}
lastUpdate = time();
It's also good to check if you lose focus and stop updating, and when you gain focus set lastUpdate to 0. That way you don't get a huge deltaT to process when you get back.
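A compact Python sketch of the same idea, including the focus reset. The class `Mover` and its method names are assumptions for illustration, and the current time is passed in rather than read from a clock so the behaviour is easy to check by hand.

```python
class Mover:
    """Frame-rate independent motion with a reset after losing focus."""

    def __init__(self, acceleration):
        self.acceleration = acceleration
        self.velocity = 0.0
        self.position = 0.0
        self.last_update = None      # None means "no previous frame to measure from"

    def update(self, now):
        """Advance the simulation to time `now` (in seconds)."""
        if self.last_update is not None:
            delta_t = now - self.last_update
            self.velocity += self.acceleration * delta_t
            self.position += self.velocity * delta_t
        self.last_update = now

    def on_focus_gained(self):
        """Forget the old timestamp so the pause is not simulated."""
        self.last_update = None


m = Mover(acceleration=2.0)
m.update(0.0)
m.update(0.5)
m.on_focus_gained()      # pretend the window was in the background for a while
m.update(100.0)          # first frame back: only records the time
m.update(100.5)
print(m.velocity, m.position)  # 2.0 1.5
```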
If you want to see what can be done with very simple physics models using very simple maths, take a look at some of the Scratch projects at http://scratch.mit.edu/ - you may get some useful ideas & you'll certainly have fun.
This is probably not what you are looking for, but depending on what engine you are working on, it might be better to use an engine built by someone else, like Farseer (for C#).
Note Codeplex is down for maintenance.
First off, this question is ripped out from this question. I did it because I think this part is bigger than a sub-part of a longer question. If it offends, please pardon me.
Assume that you have an algorithm that generates randomness. Now how do you test it?
Or to be more direct - Assume you have an algorithm that shuffles a deck of cards, how do you test that it's a perfectly random algorithm?
To add some theory to the problem -
A deck of cards can be shuffled in 52! (52 factorial) different ways. Take a deck of cards, shuffle it by hand and write down the order of all cards. What is the probability that you would have gotten exactly that shuffle? Answer: 1 / 52!.
What is the chance that you, after shuffling, will get A, K, Q, J ... of each suit in a sequence? Answer 1 / 52!
So, just shuffling once and looking at the result will give you absolutely no information about your shuffling algorithm's randomness. Twice and you have more information, three times even more...
How would you black box test a shuffling algorithm for randomness?
Statistics. The de facto standard for testing RNGs is the Diehard suite (originally available at http://stat.fsu.edu/pub/diehard). Alternatively, the Ent program provides tests that are simpler to interpret but less comprehensive.
As for shuffling algorithms, use a well-known algorithm such as Fisher-Yates (a.k.a "Knuth Shuffle"). The shuffle will be uniformly random so long as the underlying RNG is uniformly random. If you are using Java, this algorithm is available in the standard library (see Collections.shuffle).
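For reference, a short Python sketch of the Fisher-Yates shuffle. The optional `rng` parameter is an assumption of this sketch so that a seeded generator can be passed in; Python's own `random.shuffle` already does this job and is what you would normally call.

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Shuffle `items` in place; uniform if rng.randint is uniform."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)                    # pick from 0..i, both ends included
        items[i], items[j] = items[j], items[i]  # move the pick into its final slot


deck = list(range(52))
fisher_yates_shuffle(deck, random.Random(2024))
print(deck[:5])
```

The key detail is that position i is swapped with a position chosen from 0..i only, not from the whole deck; picking from the whole deck on every step is a well-known way to get a biased shuffle.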
It probably doesn't matter for most applications, but be aware that most RNGs do not provide sufficient degrees of freedom to produce every possible permutation of a 52-card deck (explained here).
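A quick Python calculation makes the degrees-of-freedom point concrete (the helper name `bits_to_index_all_shuffles` is hypothetical): it counts how many bits are needed to give every ordering of the deck its own number.

```python
import math


def bits_to_index_all_shuffles(cards=52):
    """Bits needed to assign a distinct number to every ordering of the deck."""
    return math.factorial(cards).bit_length()


print(bits_to_index_all_shuffles())          # 226
print(2 ** 64 < math.factorial(52))          # True
```

So a generator whose entire state is a 32-, 48- or 64-bit seed can only ever produce a vanishingly small fraction of the 52! possible decks, however good its output looks statistically.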
Here's one simple check that you can perform. It uses generated random numbers to estimate Pi. It's not proof of randomness, but poor RNGs typically don't do well on it (they will return something like 2.5 or 3.8 rather than ~3.14).
Ideally this would be just one of many tests that you would run to check randomness.
Something else that you can check is the standard deviation of the output. The expected standard deviation for a uniformly distributed population of values in the range 0..n approaches n/sqrt(12).
/**
 * This is a rudimentary check to ensure that the output of a given RNG
 * is approximately uniformly distributed.  If the RNG output is not
 * uniformly distributed, this method will return a poor estimate for the
 * value of pi.
 * @param rng The RNG to test.
 * @param iterations The number of random points to generate for use in the
 * calculation.  This value needs to be sufficiently large in order to
 * produce a reasonably accurate result (assuming the RNG is uniform).
 * Less than 10,000 is not particularly useful.  100,000 should be sufficient.
 * @return An approximation of pi generated using the provided RNG.
 */
public static double calculateMonteCarloValueForPi(Random rng,
                                                   int iterations)
{
    // Assumes a quadrant of a circle of radius 1, bounded by a box with
    // sides of length 1.  The area of the square is therefore 1 square unit
    // and the area of the quadrant is (pi * r^2) / 4.
    int totalInsideQuadrant = 0;
    // Generate the specified number of random points and count how many fall
    // within the quadrant and how many do not.  We expect the number of points
    // in the quadrant (expressed as a fraction of the total number of points)
    // to be pi/4.  Therefore pi = 4 * ratio.
    for (int i = 0; i < iterations; i++)
    {
        double x = rng.nextDouble();
        double y = rng.nextDouble();
        if (isInQuadrant(x, y))
        {
            ++totalInsideQuadrant;
        }
    }
    // From these figures we can deduce an approximate value for Pi.
    return 4 * ((double) totalInsideQuadrant / iterations);
}

/**
 * Uses Pythagoras' theorem to determine whether the specified coordinates
 * fall within the area of the quadrant of a circle of radius 1 that is
 * centered on the origin.
 * @param x The x-coordinate of the point (must be between 0 and 1).
 * @param y The y-coordinate of the point (must be between 0 and 1).
 * @return True if the point is within the quadrant, false otherwise.
 */
private static boolean isInQuadrant(double x, double y)
{
    double distance = Math.sqrt((x * x) + (y * y));
    return distance <= 1;
}
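The standard-deviation check mentioned above can be sketched just as briefly. This Python version (function name and default sizes are assumptions) draws values uniformly from 0..n and returns the measured standard deviation divided by the expected n/sqrt(12), so a healthy generator should give a number very close to 1.

```python
import math
import random
import statistics


def stddev_ratio(rng, n=100.0, samples=100_000):
    """Measured standard deviation of uniform 0..n samples over n/sqrt(12)."""
    values = [rng.uniform(0.0, n) for _ in range(samples)]
    return statistics.pstdev(values) / (n / math.sqrt(12))


print(stddev_ratio(random.Random()))
```

Like the pi estimate, this only catches grossly non-uniform output; a generator can pass it and still be badly correlated from one value to the next.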
First, it is impossible to know for sure if a certain finite output is "truly random" since, as you point out, any output is possible.
What can be done, is to take a sequence of outputs and check various measurements of this sequence against what is more likely. You can derive a sort of confidence score that the generating algorithm is doing a good job.
For example, you could check the output of 10 different shuffles. Assign a number 0-51 to each card, and take the average of the card in position 6 across the shuffles. The convergent average is 25.5, so you would be surprised to see a value of 1 here. You could use the central limit theorem to get an estimate of how likely each average is for a given position.
But we shouldn't stop here! Because this algorithm could be fooled by a system that only alternates between two shuffles that are designed to give the exact average of 25.5 at each position. How can we do better?
We expect a uniform distribution (equal likelihood for any given card) at each position, across different shuffles. So among the 10 shuffles, we could try to verify that the choices 'look uniform.' This is basically just a reduced version of the original problem. You could check that the standard deviation looks reasonable, that the min is reasonable, and the max value as well. You could also check that other values, such as the closest two cards (by our assigned numbers), also make sense.
But we also can't just add various measurements like this ad infinitum, since, given enough statistics, any particular shuffle will appear highly unlikely for some reason (e.g. this is one of very few shuffles in which cards X,Y,Z appear in order). So the big question is: which is the right set of measurements to take? Here I have to admit that I don't know the best answer. However, if you have a certain application in mind, you can choose a good set of properties/measurements to test, and work with those -- this seems to be the way cryptographers handle things.
There's a lot of theory on testing randomness. For a very simple test on a card shuffling algorithm you could do a lot of shuffles and then run a chi squared test that the probability of each card turning up in any position was uniform. But that doesn't test that consecutive cards aren't correlated so you would also want to do tests on that.
Volume 2 of Knuth's Art of Computer Programming gives a number of tests that you could use in sections 3.3.2 (Empirical tests) and 3.3.4 (The Spectral Test) and the theory behind them.
The only way to test for randomness is to write a program that attempts to build a predictive model for the data being tested, and then use that model to try to predict future data, and then showing that the uncertainty, or entropy, of its predictions tend towards maximum (i.e. the uniform distribution) over time. Of course, you'll always be uncertain whether or not your model has captured all of the necessary context; given a model, it'll always be possible to build a second model that generates non-random data that looks random to the first. But as long as you accept that the orbit of Pluto has an insignificant influence on the results of the shuffling algorithm, then you should be able to satisfy yourself that its results are acceptably random.
Of course, if you do this, you might as well use your model generatively, to actually create the data you want. And if you do that, then you're back at square one.
Shuffle a lot, and then record the outcomes (if I'm reading this correctly). I remember seeing comparisons of "random number generators". They just test it over and over, then graph the results.
If it is truly random the graph will be mostly even.
I'm not fully following your question. You say
Assume that you have an algorithm that generates randomness. Now how do you test it?
What do you mean? If you're assuming you can generate randomness, there's no need to test it.
Once you have a good random number generator, creating a random permutation is easy (e.g. Call your cards 1-52. Generate 52 random numbers assigning each one to a card in order, and then sort according to your 52 randoms) . You're not going to destroy the randomness of your good RNG by generating your permutation.
The difficult question is whether you can trust your RNG. Here's a sample link to people discussing that issue in a specific context.
Testing 52! possibilities is of course impossible. Instead, try your shuffle on smaller numbers of cards, like 3, 5, and 10. Then you can test billions of shuffles and use a histogram and the chi-square statistical test to check that each permutation comes up a roughly equal number of times.
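Here is a sketch of that small-deck test in Python, with the chi-square statistic computed by hand so no extra library is needed. The function names, the 3-card deck and the 60,000 trials are assumptions of the sketch; `naive_shuffle` is a deliberately biased shuffle included only to show what a failure looks like.

```python
import itertools
import random
from collections import Counter


def chi_square_of_shuffle(shuffle, cards=3, trials=60_000):
    """Chi-square statistic of observed permutation counts vs. a uniform spread."""
    counts = Counter()
    for _ in range(trials):
        deck = list(range(cards))
        shuffle(deck)                       # shuffles the list in place
        counts[tuple(deck)] += 1
    perms = list(itertools.permutations(range(cards)))
    expected = trials / len(perms)
    return sum((counts[p] - expected) ** 2 / expected for p in perms)


def naive_shuffle(deck, rng):
    """Biased on purpose: swaps each slot with ANY slot, not just 0..i."""
    n = len(deck)
    for i in range(n):
        j = rng.randrange(n)
        deck[i], deck[j] = deck[j], deck[i]


rng = random.Random()
print("library shuffle:", chi_square_of_shuffle(rng.shuffle))
print("naive shuffle:  ", chi_square_of_shuffle(lambda d: naive_shuffle(d, rng)))
```

With 3 cards there are 6 permutations and therefore 5 degrees of freedom, for which a statistic above roughly 20.5 would be expected less than 0.1% of the time from a fair shuffle. The naive shuffle lands far above that, because its 27 equally likely swap sequences cannot be divided evenly among 6 permutations.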
No code so far, therefore I copy-paste a testing part from my answer to the original question.
// ...
int main() {
    typedef std::map<std::pair<size_t, Deck::value_type>, size_t> Map;
    Map freqs;
    Deck d;
    const size_t ntests = 100000;

    // compute frequencies of events: card at position
    for (size_t i = 0; i < ntests; ++i) {
        d.shuffle();
        size_t pos = 0;
        for (Deck::const_iterator j = d.begin(); j != d.end(); ++j, ++pos)
            ++freqs[std::make_pair(pos, *j)];
    }

    // if Deck.shuffle() is correct then all frequencies must be similar
    for (Map::const_iterator j = freqs.begin(); j != freqs.end(); ++j)
        std::cout << "pos=" << j->first.first << " card=" << j->first.second
                  << " freq=" << j->second << std::endl;
}
This code does not test randomness of underlying pseudorandom number generator. Testing PRNG randomness is a whole branch of science.
For a quick test, you can always try compressing it. Once it doesn't compress, then you can move onto other tests.
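A quick way to try this in Python uses the standard `zlib` module. The helper name `compression_ratio` is illustrative, and `os.urandom` is used here only as a stand-in for "output that should not compress".

```python
import os
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (about 1.0 means incompressible)."""
    return len(zlib.compress(data, 9)) / len(data)


print(compression_ratio(os.urandom(100_000)))   # close to 1.0
print(compression_ratio(b"AKQJ" * 25_000))      # far below 1.0
```

For a shuffler you could concatenate many shuffled decks as bytes and feed them in. Since every deck contains the same 52 values exactly once, the comparison that matters is against output from a shuffle you trust, not against 1.0.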
I've tried dieharder but it refuses to work for a shuffle. All tests fail. It is also really stodgy; it won't let you specify the range of values you want or anything like that.
Pondering it myself, what I would do is something like:
Setup (Pseudo code)
// A card has a Number 0-51 and a position 0-51
int[][] StatMatrix = new int[52][52]; // Assume all are set to 0 as starting values
ShuffleCards();
ForEach (Card in Cards) {
    StatMatrix[Card.Position][Card.Number]++;
}
This gives us a matrix 52x52 indicating how many times a card has ended up at a certain position. Repeat this a large number of times (I would start with 1000, but people better at statistics than me may give a better number).
Analyze the matrix
If we have perfect randomness and perform the shuffle an infinite number of times then for each card and for each position the number of times the card ended up in that position is the same as for any other card. Saying the same thing in a different way:
StatMatrix[position][card] / numberOfShuffles = 1/52.
So I would calculate how far from that number we are.
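Put together in Python, the setup and analysis might look like the sketch below. The function names and the 5,000 shuffles are assumptions; the shuffle under test is passed in as a function that shuffles a list in place.

```python
import random


def position_frequencies(shuffle, cards=52, shuffles=5000):
    """freqs[position][card]: share of shuffles in which card landed at position."""
    counts = [[0] * cards for _ in range(cards)]
    for _ in range(shuffles):
        deck = list(range(cards))
        shuffle(deck)
        for position, card in enumerate(deck):
            counts[position][card] += 1
    return [[c / shuffles for c in row] for row in counts]


def max_deviation(freqs):
    """Largest distance of any cell from the ideal 1/number_of_cards."""
    ideal = 1 / len(freqs)
    return max(abs(f - ideal) for row in freqs for f in row)


freqs = position_frequencies(random.Random().shuffle)
print(max_deviation(freqs))
```

With only a few thousand shuffles every cell is still noisy, so the deviation will not be zero even for a perfect shuffle; what matters is that it shrinks as the number of shuffles grows and that no position/card pair stands out consistently.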