Expected lifetime of the mouse in this Markov chain model - julia

I was reading the cat and mouse Markov model on wikipedia, and decided to write some Julia code to empirically confirm the analytical results:
P = [
0 0 0.5 0 0.5 ;
0 0 1 0 0 ;
0.25 0.25 0 0.25 0.25;
0 0 0.5 0 0.5 ;
0 0 0 0 1
]
prob_states = transpose([0.0, 1, 0, 0, 0])
prob_end = [0.0]
for i in 1:2000
    prob_states = prob_states * P
    prob_end_new = (1 - sum(prob_end)) * prob_states[end]
    push!(prob_end, prob_end_new)
    println("Ending probability: ", prob_end_new)
    println("Cumulative: ", sum(prob_end))
end
println("Expected lifetime: ", sum(prob_end .* Array(1:2001)))
Here P is the transition matrix, prob_states is the probability distribution over the states at each iteration, and prob_end is an array of the probabilities of termination at each step (e.g. prob_end[3] is the probability of termination at step 3).
According to this script, the expected lifetime of the mouse is about 4.3, while the analytical result is 4.5. The script makes sense to me, so I really don't know where it could have gone wrong. Could anyone help?
P.S. Increasing the number of iterations by an order of magnitude changes almost nothing.

The probability of the mouse surviving approaches zero very quickly. This is unfortunate not only for the mouse but also for us, because we cannot use 64-bit floating-point numbers (which Julia uses here by default) to accurately approximate these tiny survival probabilities.
In fact, most of the values in prob_end are identically zero after a relatively low number of iterations, whereas evaluated analytically they should be not quite zero. The Float64 type simply cannot represent such small positive numbers.
This is why multiplying and summing the arrays never quite reaches 4.5: the steps that should nudge the sum closer to this value cannot make their contribution because they are equal to zero. We see convergence to the lower value instead.
Using a different type that can represent arbitrarily tiny positive values is a possibility. There are some suggestions here, but you may find them very slow and memory-heavy when performing anything more than a few hundred iterations of this Markov chain model.
Another solution could be to convert the code to work with log probabilities instead (which are often used to overcome exactly this limitation of floating-point numbers).

If you just want to empirically confirm the result, you can simulate the model directly:
using Statistics  # provides mean and std (needed on Julia 0.7 and later)
const first_index = 1
const last_index = 5
const cat_start = 2
const mouse_start = 4
function move(i)
    if i == first_index
        return first_index + 1
    elseif i == last_index
        return last_index - 1
    else
        return i + rand([-1,1])
    end
end
function step(cat, mouse)
    return move(cat), move(mouse)
end
function game(cat, mouse)
    i = 1
    while cat != mouse
        cat, mouse = step(cat, mouse)
        i += 1
    end
    return i
end
function run()
    res = [game(cat_start, mouse_start) for i=1:10_000_000]
    return mean(res), std(res)/sqrt(length(res))
end
μ,σ = run()
println("Mean lifetime: $μ ± $σ")
Example output:
Mean lifetime: 4.5004993 ± 0.0009083568998918751
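The analytical value of 4.5 can also be cross-checked without random sampling. Below is a minimal sketch in Python (an illustration, not code from the posts above). It assumes state 5 is absorbing, so the probability of ending exactly at step t is taken as the increase in that state's probability between steps t-1 and t. The name expected_lifetime is hypothetical.

```python
# Transition matrix from the question; state 5 (index 4) is absorbing.
P = [
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.25, 0.25, 0.0, 0.25, 0.25],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

def expected_lifetime(P, start, steps=2000):
    """Sum t * P(absorbed exactly at step t) over the first `steps` steps."""
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    absorbed_before = dist[-1]
    expectation = 0.0
    for t in range(1, steps + 1):
        # Row vector times matrix: distribution over states after t steps.
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Newly absorbed mass is the probability of ending exactly at step t.
        expectation += t * (dist[-1] - absorbed_before)
        absorbed_before = dist[-1]
    return expectation

# start=1 is state 2 in the 1-based numbering used in the question.
print(expected_lifetime(P, start=1))
```

Under these assumptions the sum agrees with the simulated mean to within rounding.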

Related

Wave prediction after a FFT with a certain phase and frequency

I am using a sliding window to extract information from my EEG data with an FFT. Now I want to predict the signal from one window into the next, so I extract the phase from a 0.25-second window to predict the following 0.25-second window.
I am new to signal processing and prediction, so my knowledge here is limited.
I am not able to generate a sine wave from my extracted phase and frequency, and I am just not finding a solution. I might only need a push in the right direction.
Is there a function in R to help me generate a suitable sine wave?
So I have my maximum frequency, with its phase extracted, and need to generate a wave from this information.
Here is pseudo-code to synthesize a sine curve of a chosen frequency. It currently assumes an initial phase shift of zero, so alter the starting theta value if you need a different initial phase shift.
// Requires the "fmt" and "math" packages.
func pop_audio_buffer(number_of_samples float64, given_freq float64,
    samples_per_second float64) ([]float64, error) {
    // The output sine curve starts at the zero crossing because theta begins at 0.
    // Starting at zero avoids the "pop" that otherwise happens when rendered audio
    // begins at, say, 0.5 within the possible range -1 to +1.
    // Where the curve stops depends on the frequency, sample rate and buffer size.
    int_number_of_samples := int(number_of_samples)
    if int_number_of_samples == 0 {
        return nil, fmt.Errorf("0 samples requested in pop_audio_buffer (number_of_samples = %v): "+
            "is your desired_num_seconds too small, or the sample rate too low?", number_of_samples)
    }
    source_buffer := make([]float64, int_number_of_samples)
    // Phase advance per sample, in radians.
    incr_theta := (2.0 * math.Pi * given_freq) / samples_per_second
    theta := 0.0 // initial phase shift; change this for a different starting phase
    for curr_sample := 0; curr_sample < int_number_of_samples; curr_sample++ {
        source_buffer[curr_sample] = math.Sin(theta)
        theta += incr_theta
    }
    return source_buffer, nil
} // pop_audio_buffer
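To connect this with the question, here is a minimal Python sketch of the same idea with a non-zero starting phase. It assumes the extracted phase is the phase at the start of the window and is measured relative to a sine; a phase taken from the angle of an FFT bin is usually relative to a cosine, in which case pi/2 would be added. The function names and the example values (10 Hz, 256 Hz sampling) are hypothetical.

```python
import math

def synthesize_sine(freq_hz, phase_rad, sample_rate_hz, num_samples, amplitude=1.0):
    """Return samples of amplitude * sin(2*pi*f*t + phase) for t = n / sample_rate."""
    theta_step = 2.0 * math.pi * freq_hz / sample_rate_hz
    return [amplitude * math.sin(phase_rad + theta_step * n) for n in range(num_samples)]

def phase_at_next_window(phase_rad, freq_hz, window_seconds):
    """Advance the phase by one window length so the prediction continues the wave."""
    return (phase_rad + 2.0 * math.pi * freq_hz * window_seconds) % (2.0 * math.pi)

# Hypothetical values: a 10 Hz component, 256 Hz sampling, 0.25 s windows (64 samples).
next_phase = phase_at_next_window(phase_rad=0.3, freq_hz=10.0, window_seconds=0.25)
predicted = synthesize_sine(10.0, next_phase, 256.0, num_samples=64)
```

The predicted window is only as good as the assumption that the signal is a single stationary sinusoid over both windows.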

Need help interpreting this code

So this code was handed out at our school; it's one of many examples (a model of rolling a fair die).
x<-runif(1)
y<-as.double(x<=c(1/6,2/6,3/6,4/6,5/6,1))*(1:6)
x<-min(y[y>0])
I'm having trouble understanding the relation between this code and rolling a die.
The first line generates one uniformly distributed random number x between 0 and 1.
The second line applies a condition to x: for each component of the vector (1/6,2/6,3/6,4/6,5/6,1), we get TRUE=1 if x is less than or equal to that component, else FALSE=0.
This result is then multiplied elementwise by the vector (1,2,3,4,5,6).
Lastly, we take the minimum value of that product (restricted to entries greater than zero).
I can't get the intuition behind this. Would someone mind explaining the relation of this code to rolling a die in real life? I'm confused.
When rolling a die, each number has the same probability of 1/6 of appearing.
What is done here is to simulate rolling a die.
Therefore, in the first line a random number between 0 and 1 is generated.
The intervals it is compared to are all equally sized, each with a length of 1/6.
Therefore, the probability that x lies in any one of these intervals is again 1/6.
The second and third lines then look up which interval x has fallen into.
Let's do an example:
Suppose x is 0.25.
Then the vector of comparisons in the second line would look like this:
FALSE, TRUE, TRUE, TRUE , TRUE, TRUE
With the multiplication you get:
0, 2, 3, 4, 5, 6
Therefore, at the end x is equal to 2.
So at the end x is the number the die is showing.
Benjamin is right; basically your code is saying:
If runif(1) is in 0 to 1/6, the die roll is 1
If runif(1) is in 1/6 to 2/6, the die roll is 2
If runif(1) is in 2/6 to 3/6, the die roll is 3
etc.
Because the second line returns a vector of results, x<-min(y[y>0]) just returns the first positive value, which is your die roll.
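For readers who find the vectorised R hard to follow, here is a minimal Python sketch that mirrors the three R lines step by step. The function name roll_die and the optional x argument (used to reproduce the worked example) are assumptions made for illustration.

```python
import random

def roll_die(x=None):
    """Map a uniform number x in [0, 1) to a die face, mirroring the R code."""
    if x is None:
        x = random.random()                       # like runif(1)
    thresholds = [k / 6 for k in range(1, 7)]     # 1/6, 2/6, ..., 1
    # Each entry is the face value if x <= threshold, otherwise 0.
    y = [face if x <= t else 0 for face, t in zip(range(1, 7), thresholds)]
    # The smallest positive entry is the first interval that contains x.
    return min(v for v in y if v > 0)

print(roll_die(0.25))  # the worked example above, where x = 0.25
```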

Interview: random3 function implementation using random2

In a recent interview I was asked the following question. There is a function random2(), which returns 0 or 1 with equal probability (0.5). Write implementations of random4() and random3() using random2().
It was easy to implement random4() like this
if(random2())
return random2();
return random2() + 2;
But I had difficulties with random3(). The only implementation I could come up with:
uint32_t sum = 0;
for (uint32_t i = 0; i != N; ++i)
sum += random2();
return sum % 3;
This implementation of random3() is based only on my intuition. I'm not sure it is actually correct, because I can't mathematically prove its correctness. Can somebody help me with this question, please?
random3:
Not sure if this is the most efficient way, but here's my take:
x = random2 + 2*random2
What can happen:
0 + 0 = 0
0 + 2 = 2
1 + 0 = 1
1 + 2 = 3
The above are all the possibilities of what can happen, thus each has equal probability, so...
(p(x=c) is the probability that x = c)
p(x=0) = 0.25
p(x=1) = 0.25
p(x=2) = 0.25
p(x=3) = 0.25
Now while x = 3, we just keep generating another number, thus giving equal probability to 0,1,2. More technically, you would distribute the probability from x=3 across all of them repeatedly such that p(x=3) tends to 0, thus the probability of the others will tend to 0.33 each.
Code:
do
    val = random2() + 2*random2();
while (val == 3);
return val;
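The rejection idea can be sketched in Python as follows. Here random2 is a stand-in built on random.getrandbits(1), and the bit_source parameter is a hypothetical hook that makes the function easy to check with a scripted sequence of bits.

```python
import random

def random2():
    """Stand-in for the interview's random2(): 0 or 1 with equal probability."""
    return random.getrandbits(1)

def random3(bit_source=random2):
    """Rejection sampling: draw a value in 0..3 and retry whenever it is 3."""
    while True:
        val = bit_source() + 2 * bit_source()
        if val != 3:
            return val

# Scripted bits 1,1 give 3 (rejected); the next pair 0,1 gives 0 + 2*1 = 2.
bits = iter([1, 1, 0, 1])
print(random3(lambda: next(bits)))
```

Each accepted value comes from one of three equally likely pairs of bits, which is why 0, 1 and 2 are equally likely.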
random4:
Let's run through your code:
if(random2())
return random2();
return random2() + 2;
First call has 50% chance of 1 (true) => returns either 0 or 1 with 50% * 50% probability, thus 25% each
First call has 50% chance of 0 (false) => returns either 2 or 3 with 50% * 50% probability, thus 25% each
Thus your code generates 0,1,2,3 with equal probability.
Update inspired by e4e5f4's answer:
For a more deterministic answer than the one I provided above...
Generate some large number by calling random2 a bunch of times and mod the result by the desired number.
This won't be exactly the right probability for each, but it will be close.
So, for a 32-bit integer by calling random2 32 times, target = 3:
Total numbers: 4294967296
Number of x's such that x%3 = 1 or 2: 1431655765
Number of x's such that x%3 = 0: 1431655766
Probability of 1 or 2 (each): 0.33333333325572311878204345703125
Probability of 0: 0.3333333334885537624359130859375
So within 0.00000002% of the correct probability, seems pretty close.
Code:
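uint32_t sum = 0; // unsigned, so that all 32 bits can be used
for (int i = 0; i < 32; i++)
    sum = 2*sum + random2();
return sum % 3; // 3 is the target here; use N for a general target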
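The counts quoted for the 32-bit case can be reproduced with exact integer arithmetic. This is a small Python sketch; the helper name residue_counts is hypothetical.

```python
from fractions import Fraction

def residue_counts(bits, n):
    """Count how many integers in [0, 2**bits) leave each remainder modulo n."""
    total = 2 ** bits
    # Integers congruent to r are r, r + n, r + 2n, ...; count those below total.
    return [(total - r + n - 1) // n for r in range(n)]

counts = residue_counts(32, 3)
probabilities = [Fraction(c, 2 ** 32) for c in counts]
print(counts)
print([float(p) for p in probabilities])
```

The remainder 0 gets exactly one more value than the remainders 1 and 2, which is the source of the small bias.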
Note:
As pjr pointed out, this is, in general, far less efficient than the rejection method above. The probability of the rejection method needing the same number of calls to random2 (i.e. 32, assuming this is the slowest operation) is 0.25^(32/2) = 0.0000000002 = 0.00000002%. Together with the fact that this method isn't exact, this strongly favours the rejection method. Lowering the number of calls decreases the running time but increases the error, and it would probably need to be lowered quite a bit (thus reaching a high error) to approach the average running time of the rejection method.
It is useful to note the above algorithm has a maximum running time. The rejection method does not. If your random number generator is totally broken for some reason, it could keep generating the rejected number and run for quite a while or forever with the rejection method, but the for-loop above will run 32 times, regardless of what happens.
Using modulo (%) is not recommended because it introduces bias. A direct mapping works well only if n is a power of 2. Otherwise some kind of rejection is involved, as suggested by the other answer.
Another generic approach would be to emulate built-in PRNGs:
1. Call random2() 32 times and map the results to a 32-bit integer
2. Get a random number in the range [0,1) by dividing it by the maximum integer value plus one
3. Multiply this number by n (= 3, 4, ..., 73 and so on) and take the floor to get the desired output

On average, how many times will this incorrect loop iterate?

In some cases, a loop needs to run for a random number of iterations that ranges from min to max, inclusive. One working solution is to do something like this:
int numIterations = randomInteger(min, max);
for (int i = 0; i < numIterations; i++) {
/* ... fun and exciting things! ... */
}
A common mistake that many beginning programmers make is to do this:
for (int i = 0; i < randomInteger(min, max); i++) {
/* ... fun and exciting things! ... */
}
This recomputes the loop upper bound on each iteration.
I suspect that this does not give a uniform distribution of the number of times the loop will iterate that ranges from min to max, but I'm not sure exactly what distribution you do get when you do something like this. Does anyone know what the distribution of the number of loop iterations will be?
As a specific example: suppose that min = 0 and max = 2. Then there are the following possibilities:
When i = 0, the random value is 0. The loop runs 0 times.
When i = 0, the random value is nonzero. Then:
When i = 1, the random value is 0 or 1. Then the loop runs 1 time.
When i = 1, the random value is 2. Then the loop runs 2 times.
The probability of the first event is 1/3. The second event has probability 2/3, and within it, the first subcase has probability 2/3 and the second subcase has probability 1/3. Therefore, the average number of iterations is
0 × 1/3 + 1 × 2/3 × 2/3 + 2 × 2/3 × 1/3
= 0 + 4/9 + 4/9
= 8/9
Note that if the distribution were indeed uniform, we'd expect to get 1 loop iteration, but now we only get 8/9 on average. My question is whether it's possible to generalize this result to get a more exact value on the number of iterations.
Thanks!
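The 8/9 worked out in the example can be checked empirically. Below is a minimal Python sketch that imitates the incorrect loop, assuming randomInteger(min, max) is uniform and inclusive at both ends (random.randint has the same convention). The function name and the fixed seed are choices made for illustration.

```python
import random

def buggy_loop_iterations(lo, hi, rng):
    """Count iterations of `for (i = 0; i < randomInteger(lo, hi); i++)`."""
    i = 0
    while i < rng.randint(lo, hi):   # the bound is redrawn on every check
        i += 1
    return i

rng = random.Random(0)               # fixed seed so the run is repeatable
trials = 200_000
mean = sum(buggy_loop_iterations(0, 2, rng) for _ in range(trials)) / trials
print(mean)                          # should be close to 8/9 for min = 0, max = 2
```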
Final edit (maybe!). I'm 95% sure that this isn't one of the standard distributions that are appropriate. I've put what the distribution is at the bottom of this post, as I think the code that gives the probabilities is more readable! A plot for the mean number of iterations against max is given below.
Interestingly, the number of iterations tails off as you increase max. Would be interesting if someone else could confirm this with their code.
If I were to start modelling this, I would start with the geometric distribution, and try to modify that. Essentially we're looking at a discrete, bounded distribution. So we have zero or more "failures" (not meeting the stopping condition), followed by one "success". The catch here, compared to the geometric or Poisson, is that the probability of success changes (also, like the Poisson, the geometric distribution is unbounded, but I think structurally the geometric is a good base). Assuming min=0, the basic mathematical form for P(X=k), 0 <= k <= max, where k is the number of iterations the loop runs, is, like the geometric distribution, the product of k failure terms and 1 success term, corresponding to k "false"s on the loop condition and 1 "true". (Note that this holds even to calculate the last probability, as the chance of stopping is then 1, which obviously makes no difference to a product).
Following on from this, an attempt to implement this in code, in R, looks like this:
fx = function(k,maximum)
{
n=maximum+1;
failure = factorial(n-1)/factorial(n-1-k) / n^k;
success = (k+1) / n;
failure * success
}
This assumes min=0, but generalizing to arbitrary mins isn't difficult (see my comment on the OP). To explain the code: first, as shown by the OP, the probabilities all have (max+1) as a denominator, so we calculate the denominator, n. Next, we calculate the product of the failure terms. Here factorial(n-1)/factorial(n-1-k) means, for example, for max=2, n=3 and k=2: 2*1. It generalises to give you (n-1)(n-2)... as the numerator of the total probability of failure. The probability of success increases as you get further into the loop, until finally, when k=maximum, it is 1.
Plotting this analytic formula gives the same results as the OP, and the same shape as the simulation plotted by John Kugelman.
Incidentally the R code to do this is as follows
plot_probability_mass_function = function(maximum)
{
x=0:maximum;
barplot(fx(x,max(x)), names.arg=x, main=paste("max",maximum), ylab="P(X=x)");
}
par(mfrow=c(3,1))
plot_probability_mass_function(2)
plot_probability_mass_function(10)
plot_probability_mass_function(100)
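The same formula can be evaluated exactly with rational arithmetic, which makes it easy to confirm the 8/9 from the question. This is a minimal Python sketch that mirrors fx for min = 0; the names pmf and mean_iterations are hypothetical.

```python
from fractions import Fraction
from math import factorial

def pmf(k, maximum):
    """P(loop runs exactly k times) for min = 0, mirroring the R function fx."""
    n = maximum + 1
    # k failed stop-checks: (n-1)/n * (n-2)/n * ... * (n-k)/n
    failure = Fraction(factorial(n - 1), factorial(n - 1 - k) * n ** k)
    # then one successful stop-check with probability (k+1)/n
    success = Fraction(k + 1, n)
    return failure * success

def mean_iterations(maximum):
    """Expected number of iterations as an exact fraction."""
    return sum(k * pmf(k, maximum) for k in range(maximum + 1))

print([pmf(k, 2) for k in range(3)])
print(mean_iterations(2))
```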
Mathematically, the distribution is, if I've got my maths right, given by (writing m for max and taking min = 0):
$$P(X = x) = \left( \prod_{i=1}^{x} \frac{m + 1 - i}{m + 1} \right) \frac{x + 1}{m + 1}, \qquad 0 \le x \le m$$
which simplifies to
$$P(X = x) = \frac{m! \, (x + 1)}{(m - x)! \, (m + 1)^{x + 1}}$$
(thanks a bunch to http://www.codecogs.com/latex/eqneditor.php)
The latter is given by the R function
f = function(x,m) { factorial(m)*(x+1)/(factorial(m-x)*(m+1)^(x+1)) }
Plotting the mean number of iterations is done like this in R
meanf = function(maximum)
{
    x = 0:maximum
    probs = f(x,maximum)
    x %*% probs
}
par(mfrow=c(2,1))
max_range = 1:10
plot(sapply(max_range, meanf) ~ max_range, ylab="Mean number of iterations", xlab="max")
max_range = 1:100
plot(sapply(max_range, meanf) ~ max_range, ylab="Mean number of iterations", xlab="max")
Here are some concrete results I plotted with matplotlib. The X axis is the value i reached. The Y axis is the number of times that value was reached.
The distribution is clearly not uniform. I don't know what distribution it is offhand; my statistics knowledge is quite rusty.
1. min = 10, max = 20, iterations = 100,000
2. min = 100, max = 200, iterations = 100,000
I believe that it would still, given a sufficient amount of executions, conform to the distribution of the randomInteger function.
But this is probably a question better suited to be asked on MATHEMATICS.
I don’t know the math behind it, but I know how to compute it! In Haskell:
import Numeric.Probability.Distribution
iterations min max = iteration 0
  where
    iteration i = do
      x <- uniform [min..max]
      if i < x
        then iteration (i + 1)
        else return i
Now expected (iterations 0 2) gives you the expected value of ~0.89. Maybe someone with the requisite math knowledge can explain what I’m actually doing here. Because you start at 0, the loop will always run at least min times.

Generate a Random Number within a Range

I have done this before, but now I'm struggling with it again, and I think I am not understanding the math underlying the issue.
I want to generate a random number within a small range on either side of 1. Examples would be .98, 1.02, .94, 1.1, etc. All of the examples I find describe getting a random number between 0 and 100, but how can I use that to get a number within the range I want?
The programming language doesn't really matter here, though I am using Pure Data. Could someone please explain the math involved?
Uniform
If you want a (pseudo-)uniform distribution (evenly spread) between 0.9 and 1.1 then the following will work:
range = 0.2
return 1-range/2+rand(100)*range/100
Adjust the range accordingly.
Pseudo-normal
If you wanted a normal distribution (bell curve) you would need special code, which would be language/library specific. You can get a close approximation with this code:
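sd = 0.1
mean = 1
count = 12
sum = 0
for (int i = 0; i < count; i++) {
    sum = sum + (rand(100) / 100.0 - 0.5)   // roughly uniform on [-0.5, 0.5]
}
normal = sum   // the variance of the sum is count/12, so count = 12 gives roughly unit variance
normal = normal*sd + mean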
Generally speaking, to get a random number within a range, you don't start from a number between 0 and 100; you start from a number between 0 and 1. This is inconsequential, however, as you can get the 0-1 number by dividing your number by 100, so I won't belabor the point.
When thinking about the pseudocode of this, you need to think of the number between 0 and 1 which you obtain as a percentage. In other words, if I have an arbitrary range between a and b, what percentage of the way between the two endpoints is the point I have randomly selected. (Thus a random result of 0.52 means 52% of the distance between a and b)
With this in mind, consider the problem this way:
Set the start and end-points of your range.
var min = 0.9;
var max = 1.1;
Get a random number between 0 and 1
var random = Math.random();
Take the difference between your start and end range points (b - a)
var range = max - min;
Multiply your random number by the difference
var adjustment = range * random;
Add back in your minimum value.
var result = min + adjustment;
And, so you can understand the values of each step in sequence:
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random == 0.52796 (for example)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment == 0.105592
var result = min + adjustment; // result == 1.005592
Note that the result is guaranteed to be within your range. Math.random() returns a value in [0, 1): the minimum is 0, and the value can come arbitrarily close to 1 without reaching it. At these two limits, the following occur:
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random == 0.0 (minimum)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment == 0.0
var result = min + adjustment; // result == 0.9 (the range minimum)
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random just under 1.0 (the upper limit, never reached)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment just under 0.2
var result = min + adjustment; // result just under 1.1 (the range maximum)
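The steps above can be condensed into one small function. This is a minimal Python sketch; random_in_range and random_in_range_from_percent are hypothetical names, and the second variant assumes a generator that yields a number from 0 to 100, as in the question.

```python
import random

def random_in_range(low, high):
    """Scale a uniform value from [0, 1) into the range from low to high."""
    return low + (high - low) * random.random()

def random_in_range_from_percent(low, high, percent):
    """Same idea when the generator yields a number from 0 to 100."""
    return low + (high - low) * (percent / 100.0)

samples = [random_in_range(0.9, 1.1) for _ in range(5)]
print(samples)
print(random_in_range_from_percent(0.9, 1.1, 52))
```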
return 0.9 + rand(100) / 500.0
or am I missing something?
If rand() returns you a random number between 0 and 100, all you need to do is:
(rand() / 100) * 2
to get a random number between 0 and 2.
If on the other hand you want the range from 0.9 to 1.1, use the following:
0.9 + ((rand() / 100) * 0.2)
You can construct any distribution you like from a uniform one on [0,1) by a change of variable. In particular, if you want a random variable with cumulative distribution function F, you just pass a uniform random number from [0,1) through the inverse of the desired CDF.
One special (and perhaps the most popular) case is the normal distribution N(0,1). Here you can use the Box-Muller transform. Scaling the result by the standard deviation and adding the mean gives a normal distribution with the desired parameters.
You can also sum uniform random numbers to get an approximation of the normal distribution; this case is considered by Nick Fortescue above.
If your source random numbers are integers, you should first construct a real-valued random number with some known distribution. For example, you can construct a uniform distribution on [0,1) as follows: take a first integer in the range from 0 to 99 and multiply it by 0.01, take a second integer, multiply it by 0.0001 and add it to the first, and so on. This way you get a number 0.XXYYZZ... Double precision is about 16 decimal digits, so you need 8 integer random numbers to construct one uniform double.
Box-Muller to the rescue.
var z2_cached = null;
function normal_random(mean, variance) {
    // Box-Muller yields two independent standard normals; return the cached one first.
    if ( z2_cached !== null ) {
        var z2 = z2_cached;
        z2_cached = null;
        return z2 * Math.sqrt(variance) + mean;
    }
    var x1 = 1 - Math.random(); // in (0, 1], so Math.log(x1) stays finite
    var x2 = Math.random();
    var z1 = Math.sqrt(-2 * Math.log(x1) ) * Math.cos( 2*Math.PI * x2);
    z2_cached = Math.sqrt(-2 * Math.log(x1) ) * Math.sin( 2*Math.PI * x2);
    return z1 * Math.sqrt(variance) + mean;
}
Use with values of mean 1 and variance e.g. 0.01
for ( var i=0; i < 20; i++ ) console.log( normal_random(1, 0.01) );
0.937240893365304
1.072511121460833
0.9950053748909895
1.0034139439164074
1.2319710866884104
0.9834737343090275
1.0363970887198277
0.8706648577217094
1.0882382154101415
1.0425139197341595
0.9438723605883214
0.935894021237943
1.0846400276817076
1.0428213927823682
1.020602499547105
0.9547701472093025
1.2598174560413493
1.0086997644531541
0.8711594789918106
0.9669499056660755
Function gives approx. normal distribution around mean with given variance.
low + (random() / 100) * range
So for example:
0.90 + (random() / 100) * 0.2
How near? You could use a Gaussian (a.k.a. Normal) distribution with a mean of 1 and a small standard deviation.
A Gaussian is suitable if you want numbers close to 1 to be more frequent than numbers a bit further away from 1.
Some languages (such as Java) will have support for Gaussians in the standard library.
Divide by 100 and add 1. (I assume you are looking for a range from 0 to 2?)
You want a range from -1 to 1 as output from your rand() expression.
( rand(2) - 1 )
Then scale that -1 to 1 range as needed. Say, for a .1 variation on either side:
(( rand(2) - 1 ) / 10 )
Then just add one.
(( rand(2) - 1 ) / 10 ) + 1
rand() already gives you a random number between 0 and 100. The maximum number of different random values you can get from it is about 100, so, assuming that you want up to three decimal places, 0.950-1.050 is the range you would be looking at.
The distribution can then be achieved by
0.95 + (rand() / 1000)
Are you looking for a random number in the range 1 to 2, like 1.1, 1.5, 1.632, etc.? If yes, then here is some simple Python code:
import random
print(random.random() + 1)
var randomNumber = Math.random();
while(randomNumber<0.9 && randomNumber>0.1){
    randomNumber = Math.random();
}
if(randomNumber>=0.9){
    alert(randomNumber);
}
else if(randomNumber<=0.1){
    alert(1+randomNumber);
}
For numbers from 0.9 to 1.1
seed = 1
range = 0.1
if your random is from 0..100
f_rand = random/100
the generated number
gen_number = (seed+f_rand*range*2)-range
You will get
1.04; 1.08; 1.01; 0.96; ...
with seed 3, range 2 => 1.95; 4.08; 2.70; 3.06; ...
I didn't understand this (sorry):
I am trying to set a random number on either side of 1: .98, 1.02, .94, 1.1, etc.
So, I'll provide a general solution for the problem instead.
Converting a random number generator
If you have a random number generator in a given range [0, 1)* with uniform distribution, you can convert it to any distribution using the following method:
1 - Describe the distribution as a function defined on the output range and with a total area of 1. So this function is f(x) = the probability density of getting the value x.
2 - Integrate** the function.
3 - Equate it to the "randomic"***.
4 - Solve the equation for x. This gives you the value of x as a function of the randomic.
*: Generalization for any input distribution is below.
**: The constant term of the integrated function is 0 (that is, you just discard it).
***: That is, a variable that represents the result of generating a random number with uniform distribution in the range [0, 1). [I'm not sure if that's the correct name in English]
Example:
Let's say you want a value with the distribution f(x)=x^2 from 0 to 100. Well that function is not normalized because the total area below the function in the range is 1000000/3 not 1. So you normalize it scaling the curve in the vertical axis (keeping the relative proportions), that is dividing by the total area: f(x)=3*x^2 / 1000000 from 0 to 100.
Now we have a function with a total area of 1. The next step is to integrate it (you may already have done that to get the area) and equate it to the randomic.
The integrated function is: F(x)=x^3/1000000+c. And equate it to the randomic: r=x^3/1000000 (remember that we discard the constant term).
Now, we need to solve the equation for x, the resulting expression: x=100*r^(1/3). Now you can use this formula to generate numbers with the desired distribution.
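The resulting formula can be tried out directly. This minimal Python sketch draws samples with x = 100*r^(1/3) and compares the sample mean with the mean of the density f(x) = 3*x^2 / 1000000 on [0, 100], which works out to 75. The function name and the fixed seed are assumptions for illustration.

```python
import random

def sample_quadratic_density(rng=random):
    """Draw x in [0, 100) with density 3*x**2 / 1000000 via x = 100 * r**(1/3)."""
    r = rng.random()                  # the "randomic": uniform in [0, 1)
    return 100.0 * r ** (1.0 / 3.0)

rng = random.Random(42)               # fixed seed so the run is repeatable
draws = [sample_quadratic_density(rng) for _ in range(100_000)]
print(sum(draws) / len(draws))        # should be close to the exact mean of 75
```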
Generalization
If you have a random number generator with a custom distribution and want another different arbitrary distribution, you first need the source distribution function and then use it to express the target arbitrary random number generator. To get the distribution function, do the steps up to 3. For the target, do all the steps, and then replace the randomic with the expression you got from the source distribution.
This is better understood with an example...
Example:
You have a random number generator with uniform distribution in the range [0, 100) and you want.. the same distribution f(x)=3*x^2 / 1000000 from 0 to 100 for simplicity [Since for that one we already did all the steps giving us x=100*r^(1/3)].
Since the source distribution is uniform the function is constant: f(z)=1. But we need to normalize for the range, leaving us with: f(z)=1/100.
Now, we integrate it: F(z)=z/100. And equate it to the randomic: r=z/100, but this time we don't solve it for x; instead we use it to replace r in the target:
x=100*r^(1/3) where r = z/100
=>
x=100*(z/100)^(1/3)
=>
x=100^(2/3)*z^(1/3)
And now you can use x=100^(2/3)*z^(1/3) to calculate random numbers with the distribution f(x)=3*x^2 / 1000000 from 0 to 100, starting with a random number in the distribution f(z)=1/100 from 0 to 100 [uniform].
Note: If you have a normal distribution, use the bell function instead. The same method works for any other distribution. Take care with possible asymptotes that some distributions may create; you may need to try different ways to solve the equations.
On discrete distributions
Sometimes you need to express a discrete distribution; for example, you want to get 0 with 95% chance and 1 with 5% chance. So how do you do that?
Well, you divide it into rectangular distributions in such a way that the ranges join to [0, 1) and use the randomic to evaluate:
f(r) = { 0 if r is in [0, 0.95)
       { 1 if r is in [0.95, 1)
Or you can take the complex path, which is to write a distribution function like this (making each option exactly a range of length 1):
f(x) = { 0.95 if x is in [0, 1)
       { 0.05 if x is in [1, 2)
Since each range has a length of 1 and the assigned values sum up to 1, we know that the total area is 1. Now the next step would be to integrate it:
F(x) = { 0.95*x if x is in [0, 1)
       { (0.05*(x-1))+0.95 = 0.05*x + 0.9 if x is in [1, 2)
Equate it to the randomic:
r = { 0.95*x if x is in [0, 1)
    { 0.05*x + 0.9 if x is in [1, 2)
And solve the equation...
OK, to solve that kind of equation, start by calculating the output ranges by applying the function:
[0, 1) becomes [0, 0.95)
[1, 2) becomes [0.95, {(0.05*(x-1))+0.95 where x = 2} = 1)
Now, those are the ranges for the solution:
x = { ? if r is in [0, 0.95)
    { ? if r is in [0.95, 1)
Now, solve the inner functions:
x = { r/0.95 if r is in [0, 0.95)
    { 20*(r-0.9) = 20*r-18 if r is in [0.95, 1)
But, since the output is discrete, we end up with the same result after taking the integer part:
x = { 0 if r is in [0, 0.95)
    { 1 if r is in [0.95, 1)
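The interval lookup generalises to any finite list of outcomes. Here is a minimal Python sketch, assuming the probabilities sum to 1; the function name sample_discrete and the optional r argument (used to check specific values of the randomic) are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_discrete(values, probabilities, r=None):
    """Pick values[i] when r falls in the i-th cumulative interval of [0, 1)."""
    if r is None:
        r = random.random()
    cumulative = list(accumulate(probabilities))   # upper edges of the intervals
    index = bisect_right(cumulative, r)            # first interval whose upper edge exceeds r
    return values[min(index, len(values) - 1)]     # guard against rounding at the last edge

print(sample_discrete([0, 1], [0.95, 0.05], r=0.5))
print(sample_discrete([0, 1], [0.95, 0.05], r=0.97))
```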
Note: using random to mean pseudo random.
Edit: Found it on wikipedia (I knew I didn't invent it).
