Best way to get a number from a range? - math

This is a little more math orientated but I would like to know the best way to do the following;
min = 20;
max = 80;
Given a number 1 through x, what is the best way to linearly distribute these numbers evenly?
For example when n=1 value is always min, when n=x, value is always max so 80 in this case.
When n=x/2 value is 50 (mid point between min and max)
If this were to be a function like double getNum(min, max, x, n) which returns the value of the number between min/max, what would be the best way to write this?

It's pretty simple to derive the following formula:
double nth(double min, double max, int n, int x) {
return min + (max-min) * ((double) n-1) / (x-1);
}
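For readers who want to check the formula, here is a minimal Python sketch of the same interpolation. The name `nth` follows the answer above; the guard for x < 2 is an added assumption, since the formula divides by x - 1.

```python
def nth(lo: float, hi: float, n: int, x: int) -> float:
    """Return the n-th of x evenly spaced values from lo to hi (n counts from 1)."""
    if x < 2:
        # Assumption: with fewer than two steps there is no spacing, so return lo.
        return float(lo)
    return lo + (hi - lo) * (n - 1) / (x - 1)


# Five evenly spaced values from 20 to 80.
values = [nth(20, 80, n, 5) for n in range(1, 6)]
print(values)  # [20.0, 35.0, 50.0, 65.0, 80.0]
```

Note that the exact midpoint is reached at n = (x + 1) / 2, which is an integer only when x is odd.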

Related

Generate random decimal numbers with given mean in given range in R

Hey I want to generate 100 decimal numbers in the range of 10 and 50 with the mean of 32.2.
I can use this to generate the numbers in the wanted range, but I don't get the mean:
runif(100, min=10, max=50)
Or I could use this and I don't get the range:
rnorm(100,mean=32.2,sd=10)
How can I combine those two or can I use another function?
I have tried to use this approach:
R - random distribution with predefined min, max, mean, and sd values
But I don't get the exact mean I want... (31.7 in my example try)
n <- 100
y <- rgbeta(n, mean = 32.2, var = 200, min = 10, max = 50)
Edit: OK, I have lowered the var and the mean gets near 32.2, but I still want some values near the min and max of the range...
In order to get random numbers between 10 and 50 with a (true) mean of 32.2, you would need a density function that would fulfill those properties.
A uniform distribution with a min of 10 and a max of 50 (runif) will never deliver you that mean, as the true mean is 30 for that distribution.
The normal distribution has a range from -infinity to infinity, independent of its mean, so rnorm will return numbers greater than 50 and smaller than 10.
You could use a truncated normal distribution
rnormTrunc(n = 100, mean = 32.2, sd = 1, min = 10, max = 50),
if that distribution would be okay. If you need a different distribution, things will get a little more complicated.
Edit: feel free to ask if you need the math behind that, but depending on what your density function should look like it will get very complicated
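One family of densities with bounded support and an adjustable mean is the beta distribution, rescaled to [min, max]. The Python sketch below is only an illustration of "a density with those properties", not the method of any answer here. The function name and the `concentration` parameter are hypothetical: smaller values of `concentration` push more mass towards the two ends of the range.

```python
import random


def scaled_beta_sample(n, lo, hi, mean, concentration=2.0, seed=None):
    """Draw n values in [lo, hi] from a rescaled beta distribution with the given true mean."""
    if not lo < mean < hi:
        raise ValueError("mean must lie strictly between lo and hi")
    rng = random.Random(seed)
    m = (mean - lo) / (hi - lo)        # target mean on the unit interval
    alpha = m * concentration          # beta mean is alpha / (alpha + beta) = m
    beta = (1.0 - m) * concentration
    return [lo + (hi - lo) * rng.betavariate(alpha, beta) for _ in range(n)]


sample = scaled_beta_sample(20000, 10, 50, 32.2, seed=1)
print(min(sample), max(sample), sum(sample) / len(sample))
```

The true mean of this distribution is 32.2, but the mean of any finite sample will still scatter around that value.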
This isn't perfect, but maybe it's a start. I can't get the range to work out perfectly, so I just played with the "max" until I got an output I was happy with. There is probably a more solid math way to do this. The result is uniform-adjacent... at best...
rand_unif_constrained <- function(num, min, max, mean) {
vec <- runif(num, min, max)
vec / sum(vec) * mean*num
}
set.seed(35)
test <- rand_unif_constrained(100, 10, 40, 32.2) #play with max until max output is less than 50
mean(test)
#> [1] 32.2
min(test)
#> [1] 12.48274
max(test)
#> [1] 48.345
hist(test)

Math-ish recursion to formula

Ok,
so this is an application of existing mathematical practices, but I can't really apply them to my case.
So, I have x of a currency to increase the level of a game-object y for cost z.
z is calculated in cost(y.lvl) = c_1 * c_2^y.lvl / c_3, where the c's are constants.
I am seeking an efficient way to calculate, how often I can increase the level of y, given x. Currently I'm using a loop that does something like this:
double tempX = x;
int counter = 0;
while(tempX >= cost(y.lvl+counter)){
tempX-=cost(y.lvl+counter);
counter++;
}
The problem is, that in some cases, this loop has to iterate too many times to stay performant.
What I am looking for is essentially a function
int howManyCanBeBought(x,y.lvl), which calculates its result in a single go, instead of looping a lot of times.
I've read something about transforming recursions to generating functions and transforming them to closed formulas, but I didn't get the math behind it. Is there an easy way to do it?
If I understand correctly, you're looking for the largest n such that:
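$$\sum_{i=0}^{n} \frac{c_1}{c_3}\, c_2^{\,\text{lvl}+i} \le x$$
Dividing by the constant factor:
$$\sum_{i=0}^{n} c_2^{\,i} \le \frac{c_3}{c_1\, c_2^{\,\text{lvl}}}\, x$$
Using the formula for the sum of a geometric series:
$$\frac{c_2^{\,n+1} - 1}{c_2 - 1} \le \frac{c_3}{c_1\, c_2^{\,\text{lvl}}}\, x$$
And solving for the maximum integer:
$$n = \left\lfloor \log_{c_2}\!\left( \frac{c_3\,(c_2 - 1)}{c_1\, c_2^{\,\text{lvl}}}\, x + 1 \right) - 1 \right\rfloor$$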
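As a rough check of the closed form, here is a Python sketch that compares it with the loop from the question. It assumes c2 > 1 and x >= 0, and the function names are made up for illustration. Because the sum runs over i = 0..n, the number of levels bought is n + 1, which is what both functions return.

```python
import math


def cost(level, c1, c2, c3):
    """Cost of raising the object from `level` to `level + 1`."""
    return c1 * c2 ** level / c3


def levels_affordable_loop(x, level, c1, c2, c3):
    """Reference implementation: buy one level at a time until the money runs out."""
    count = 0
    while x >= cost(level + count, c1, c2, c3):
        x -= cost(level + count, c1, c2, c3)
        count += 1
    return count


def levels_affordable_closed(x, level, c1, c2, c3):
    """Closed form via the geometric series; returns n + 1 purchases (assumes c2 > 1)."""
    ratio = x * c3 * (c2 - 1) / (c1 * c2 ** level)
    return math.floor(math.log(ratio + 1, c2))


print(levels_affordable_closed(1000, 3, 10, 1.5, 2))  # 8
```

When x sits exactly on a purchase boundary, floating-point rounding in the logarithm can shift the result by one, so a final comparison against the actual cost may be worth adding in real code.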

How can I know the mean value of a function in an interval?

Suppose I have a function, something like:
fun <- function(x) 2.3*exp(-x/2)
and I want to get the mean value of this function along an interval, suppose from 2 to 20.
To get the mean, the first thing that comes to my mind is this:
mean(fun(2:20))
so as simple as start giving values to the function and computing the mean.
However, I wonder if there is a more precise way to obtain this. Any ideas?
Analytically, you can determine the mean value of a function on the interval [a,b] using:
$$\bar{f} = \frac{1}{b-a} \int_a^b f(x)\,dx$$
So, after taking the integral, you can evaluate the antiderivative at the two endpoints and get the mean value analytically. In your case the antiderivative is -4.6 * exp(-0.5 * x), which gives a mean value of 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2)) = 0.09400203.
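The analytical value can be cross-checked numerically. The Python sketch below evaluates the antiderivative at the endpoints and compares it with a midpoint-rule average. The midpoint rule is a technique swapped in here for illustration (it is not used elsewhere in this thread), and all function names are assumptions.

```python
import math


def f(x):
    return 2.3 * math.exp(-x / 2)


def antiderivative(x):
    # d/dx of -4.6 * exp(-x / 2) is 2.3 * exp(-x / 2)
    return -4.6 * math.exp(-x / 2)


def analytic_mean(a, b):
    """Mean value of f on [a, b] from the antiderivative."""
    return (antiderivative(b) - antiderivative(a)) / (b - a)


def midpoint_mean(func, a, b, n=10_000):
    """Average of func at the midpoints of n equal sub-intervals of [a, b]."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) / n


print(analytic_mean(2, 20))     # about 0.0940020
print(midpoint_mean(f, 2, 20))
```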
Now I focus on sampling along the interval, and calculating the mean like that:
get_sample_mean_from_function = function(func, interval, n = 1000) {
interval_samples = seq(interval[1], interval[2], length = n)
function_values = sapply(interval_samples, func)
return(mean(function_values))
}
fun <- function(x) 2.3*exp(-x/2)
get_sample_mean_from_function(fun, interval = c(2,20))
By increasing the number n (number of samples taken) you can increase the precision of your answer. This is how the mean value develops with increasing sample size:
n_list = c(1,4,10,15,25,50,100,500,1000,10e3,100e3,100e4,100e5)
mean_list = sapply(n_list,
function(x) get_sample_mean_from_function(fun,
interval = c(2,20), n = x))
library(ggplot2)
qplot(n_list, mean_list, geom = "point", log = "x")
Notice that it takes at least 1000 samples to get any convergence. If we compare this numerical solution with the analytical value:
real_value = 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2))
mean_list - real_value
[1] 7.521207e-01 1.286106e-01 3.984653e-02 2.494165e-02 1.421951e-02
[6] 6.841070e-03 3.355199e-03 6.607662e-04 3.297467e-04 3.291750e-05
[11] 3.291179e-06 3.291122e-07 3.291116e-08
We see that even for 100e5 samples, the difference between the analytical and numerical solution is still significant compared to double floating point precision.
If you desperately need very high precision, I'd try and go for an analytical solution. However, in practice 5000 samples is more than enough to get reasonable accuracy.

On average, how many times will this incorrect loop iterate?

In some cases, a loop needs to run for a random number of iterations that ranges from min to max, inclusive. One working solution is to do something like this:
int numIterations = randomInteger(min, max);
for (int i = 0; i < numIterations; i++) {
/* ... fun and exciting things! ... */
}
A common mistake that many beginning programmers make is to do this:
for (int i = 0; i < randomInteger(min, max); i++) {
/* ... fun and exciting things! ... */
}
This recomputes the loop upper bound on each iteration.
I suspect that this does not give a uniform distribution of the number of times the loop will iterate that ranges from min to max, but I'm not sure exactly what distribution you do get when you do something like this. Does anyone know what the distribution of the number of loop iterations will be?
As a specific example: suppose that min = 0 and max = 2. Then there are the following possibilities:
When i = 0, the random value is 0. The loop runs 0 times.
When i = 0, the random value is nonzero. Then:
When i = 1, the random value is 0 or 1. Then the loop runs 1 time.
When i = 1, the random value is 2. Then the loop runs 2 times.
The probability of the first event is 1/3. The second event has probability 2/3, and within it, the first subcase has probability 2/3 and the second subcase has probability 1/3. Therefore, the average number of iterations is
0 × 1/3 + 1 × 2/3 × 2/3 + 2 × 2/3 × 1/3
= 0 + 4/9 + 4/9
= 8/9
Note that if the distribution were indeed uniform, we'd expect to get 1 loop iteration, but now we only get 8/9 on average. My question is whether it's possible to generalize this result to get a more exact value on the number of iterations.
Thanks!
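The worked example in the question can be reproduced exactly for any min and max. The Python sketch below walks through the loop counter and tracks the probability of stopping at each value, using exact fractions. It assumes `randomInteger(min, max)` is uniform over the integers min..max inclusive, and the function names are hypothetical.

```python
from fractions import Fraction


def iteration_distribution(lo, hi):
    """Exact distribution of the iteration count of
    `for (i = 0; i < randomInteger(lo, hi); i++)`."""
    n = hi - lo + 1                 # number of equally likely random values
    dist = {}
    reach = Fraction(1)             # probability that the check with counter i happens
    i = 0
    while reach > 0:
        # The loop stops at counter i when the fresh random value is <= i.
        stop_count = min(max(i - lo + 1, 0), n)
        p_stop = Fraction(stop_count, n)
        if p_stop > 0:
            dist[i] = reach * p_stop
        reach *= 1 - p_stop
        i += 1
    return dist


def expected_iterations(lo, hi):
    return sum(k * p for k, p in iteration_distribution(lo, hi).items())


print(iteration_distribution(0, 2))
print(expected_iterations(0, 2))  # 8/9
```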
Final edit (maybe!). I'm 95% sure that this isn't one of the standard distributions that are appropriate. I've put what the distribution is at the bottom of this post, as I think the code that gives the probabilities is more readable! A plot for the mean number of iterations against max is given below.
Interestingly, the number of iterations tails off as you increase max. Would be interesting if someone else could confirm this with their code.
If I were to start modelling this, I would start with the geometric distribution, and try to modify that. Essentially we're looking at a discrete, bounded distribution. So we have zero or more "failures" (not meeting the stopping condition), followed by one "success". The catch here, compared to the geometric or Poisson, is that the probability of success changes (also, like the Poisson, the geometric distribution is unbounded, but I think structurally the geometric is a good base). Assuming min=0, the basic mathematical form for P(X=k), 0 <= k <= max, where k is the number of iterations the loop runs, is, like the geometric distribution, the product of k failure terms and 1 success term, corresponding to k "false"s on the loop condition and 1 "true". (Note that this holds even to calculate the last probability, as the chance of stopping is then 1, which obviously makes no difference to a product).
Following on from this, an attempt to implement this in code, in R, looks like this:
fx = function(k,maximum)
{
n=maximum+1;
failure = factorial(n-1)/factorial(n-1-k) / n^k;
success = (k+1) / n;
failure * success
}
This assumes min=0, but generalizing to arbitrary mins isn't difficult (see my comment on the OP). To explain the code: first, as shown by the OP, the probabilities all have (max+1) as a denominator, so we calculate the denominator, n. Next, we calculate the product of the failure terms. Here factorial(n-1)/factorial(n-1-k) means, for example, for max=2, n=3 and k=2: 2*1. It generalises to give you (n-1)(n-2)... for the numerator of the total probability of failure. The probability of success increases as you get further into the loop, until finally, when k=maximum, it is 1.
Plotting this analytic formula gives the same results as the OP, and the same shape as the simulation plotted by John Kugelman.
Incidentally the R code to do this is as follows
plot_probability_mass_function = function(maximum)
{
x=0:maximum;
barplot(fx(x,max(x)), names.arg=x, main=paste("max",maximum), ylab="P(X=x)");
}
par(mfrow=c(3,1))
plot_probability_mass_function(2)
plot_probability_mass_function(10)
plot_probability_mass_function(100)
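Mathematically, the distribution is, if I've got my maths right, given by (with m = max and min = 0):
$$P(X = x) = \frac{m!}{(m-x)!\,(m+1)^{x}} \cdot \frac{x+1}{m+1}, \qquad 0 \le x \le m$$
which simplifies to
$$P(X = x) = \frac{m!\,(x+1)}{(m-x)!\,(m+1)^{x+1}}$$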
(thanks a bunch to http://www.codecogs.com/latex/eqneditor.php)
The latter is given by the R function
f = function(x,m) { factorial(m)*(x+1)/(factorial(m-x)*(m+1)^(x+1)) }
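The same closed form can be checked with exact arithmetic. This is a small Python sketch that assumes min = 0; the names `pmf` and `mean_iterations` are illustrative. It confirms that the probabilities sum to 1 and reproduces the 8/9 from the question for max = 2.

```python
from fractions import Fraction
from math import factorial


def pmf(x, m):
    """P(X = x) for max = m and min = 0, from the simplified formula."""
    return Fraction(factorial(m) * (x + 1), factorial(m - x) * (m + 1) ** (x + 1))


def mean_iterations(m):
    """Expected number of loop iterations for max = m."""
    return sum(x * pmf(x, m) for x in range(m + 1))


for m in (2, 10, 100):
    print(m, float(mean_iterations(m)))
```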
Plotting the mean number of iterations is done like this in R
meanf = function(maximum)
{
x = 0:maximum
probs = f(x,maximum)
x %*% probs
}
par(mfrow=c(2,1))
max_range = 1:10
plot(sapply(max_range, meanf) ~ max_range, ylab="Mean number of iterations", xlab="max")
max_range = 1:100
plot(sapply(max_range, meanf) ~ max_range, ylab="Mean number of iterations", xlab="max")
Here are some concrete results I plotted with matplotlib. The X axis is the value i reached. The Y axis is the number of times that value was reached.
The distribution is clearly not uniform. I don't know what distribution it is offhand; my statistics knowledge is quite rusty.
1. min = 10, max = 20, iterations = 100,000
2. min = 100, max = 200, iterations = 100,000
I believe that it would still, given a sufficient amount of executions, conform to the distribution of the randomInteger function.
But this is probably a question better suited to be asked on MATHEMATICS.
I don’t know the math behind it, but I know how to compute it! In Haskell:
import Numeric.Probability.Distribution
iterations min max = iteration 0
where
iteration i = do
x <- uniform [min..max]
if i < x
then iteration (i + 1)
else return i
Now expected (iterations 0 2) gives you the expected value of ~0.89. Maybe someone with the requisite math knowledge can explain what I’m actually doing here. Because you start at 0, the loop will always run at least min times.

Generate a Random Number within a Range

I have done this before, but now I'm struggling with it again, and I think I am not understanding the math underlying the issue.
I want to set a random number within a small range on either side of 1. Examples would be .98, 1.02, .94, 1.1, etc. All of the examples I find describe getting a random number between 0 and 100, but how can I use that to get within the range I want?
The programming language doesn't really matter here, though I am using Pure Data. Could someone please explain the math involved?
Uniform
If you want a (pseudo-)uniform distribution (evenly spread) between 0.9 and 1.1 then the following will work:
range = 0.2
return 1-range/2+rand(100)*range/100
Adjust the range accordingly.
Pseudo-normal
If you wanted a normal distribution (bell curve) you would need special code, which would be language/library specific. You can get a close approximation with this code:
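sd = 0.1
mean = 1
count = 10
sum = 0
// add up `count` uniform values, each between -0.5 and 0.5
for(int i=0; i<count; i++) {
sum=sum+(rand(100)-50)/100.0
}
// the sum has variance count/12, so rescale it to unit variance
normal = sum / sqrt(count/12.0)
normal = normal*sd + mean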
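Here is a runnable Python sketch of the sum-of-uniforms idea, with the sum rescaled to unit variance before the standard deviation and mean are applied. The function name and the `rng` parameter are assumptions added for illustration. The result is only approximately normal, and its tails are cut off at a finite distance from the mean.

```python
import math
import random
import statistics


def approx_normal(mean, sd, count=10, rng=random):
    """Approximate a normal draw by summing `count` uniform values."""
    # Each term is uniform on [-0.5, 0.5) with variance 1/12.
    total = sum(rng.random() - 0.5 for _ in range(count))
    standard = total / math.sqrt(count / 12)   # rescale to variance 1
    return mean + sd * standard


rng = random.Random(0)
samples = [approx_normal(1.0, 0.1, rng=rng) for _ in range(20000)]
print(statistics.mean(samples), statistics.pstdev(samples))
```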
Generally speaking, to get a random number within a range, you don't get a number between 0 and 100, you get a number between 0 and 1. This is inconsequential, however, as you could simply get the 0-1 number by dividing your number by 100, so I won't belabor the point.
When thinking about the pseudocode of this, you need to think of the number between 0 and 1 which you obtain as a percentage. In other words, if I have an arbitrary range between a and b, what percentage of the way between the two endpoints is the point I have randomly selected. (Thus a random result of 0.52 means 52% of the distance between a and b)
With this in mind, consider the problem this way:
Set the start and end-points of your range.
var min = 0.9;
var max = 1.1;
Get a random number between 0 and 1
var random = Math.random();
Take the difference between your start and end range points (b - a)
var range = max - min;
Multiply your random number by the difference
var adjustment = range * random;
Add back in your minimum value.
var result = min + adjustment;
And, so you can understand the values of each step in sequence:
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random == 0.52796 (for example)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment == 0.105592
var result = min + adjustment; // result == 1.005592
Note that the result is guaranteed to be within your range. The minimum random value is 0, and the maximum random value is 1. In these two cases, the following occur:
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random == 0.0 (minimum)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment == 0.0
var result = min + adjustment; // result == 0.9 (the range minimum)
var min = 0.9;
var max = 1.1;
var random = Math.random(); // random == 1.0 (maximum)
var range = max - min; // range == 0.2
var adjustment = range * random; // adjustment == 0.2
var result = min + adjustment; // result == 1.1 (the range maximum)
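The steps above collapse into a single small function. This is a Python sketch of the same arithmetic; the function name and the `rng` parameter are assumptions. Note that Python's `random.random()` (like JavaScript's `Math.random()`) returns values in [0, 1), so the upper end of the range is approached but essentially never returned.

```python
import random


def random_in_range(lo, hi, rng=random):
    """Map a uniform value from [0, 1) onto the range from lo to hi."""
    fraction = rng.random()      # how far along the range we land, 0 <= fraction < 1
    span = hi - lo               # width of the range
    return lo + span * fraction  # add the minimum back in


rng = random.Random(42)
draws = [random_in_range(0.9, 1.1, rng) for _ in range(1000)]
print(min(draws), max(draws))
```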
return 0.9 + rand(100) / 500.0
or am I missing something?
If rand() returns you a random number between 0 and 100, all you need to do is:
(rand() / 100) * 2
to get a random number between 0 and 2.
If on the other hand you want the range from 0.9 to 1.1, use the following:
0.9 + ((rand() / 100) * 0.2)
You can construct any distribution you like from a uniform one on [0,1) by a change of variable. In particular, if you want random values from a distribution with cumulative distribution function F, you just feed a uniform random value from [0,1) into the inverse of the desired CDF.
One special (and maybe most popular) case is normal distribution N(0,1). Here you can use Box-Muller transform. Scaling it with stdev and adding a mean you get normal distribution with desired parameters.
You can sum uniform randoms and get some approximation of normal distribution, this case is considered by Nick Fortescue above.
If your source randoms are integers, you should first construct a real-valued random with some known distribution. For example, you can construct a uniform distribution on [0,1) this way: take a first integer in the range 0 to 99 and multiply it by 0.01, take a second integer, multiply it by 0.0001 and add it to the first, and so on. This way you get a number 0.XXYYZZ... Double precision is about 16 decimal digits, so you need 8 integer randoms to construct one uniform double.
Box-Müller to the rescue.
var z2_cached = null;
function normal_random(mean, variance) {
// Box-Muller produces two values per call, so reuse the spare one if we have it
if ( z2_cached !== null ) {
var z2 = z2_cached;
z2_cached = null;
return z2 * Math.sqrt(variance) + mean;
}
var x1 = 1 - Math.random(); // in (0, 1], so Math.log(x1) is finite
var x2 = Math.random();
var r = Math.sqrt(-2 * Math.log(x1) );
var z1 = r * Math.cos( 2*Math.PI * x2);
z2_cached = r * Math.sin( 2*Math.PI * x2);
return z1 * Math.sqrt(variance) + mean;
}
Use with values of mean 1 and variance e.g. 0.01
for ( var i=0; i < 20; i++ ) console.log( normal_random(1, 0.01) );
0.937240893365304
1.072511121460833
0.9950053748909895
1.0034139439164074
1.2319710866884104
0.9834737343090275
1.0363970887198277
0.8706648577217094
1.0882382154101415
1.0425139197341595
0.9438723605883214
0.935894021237943
1.0846400276817076
1.0428213927823682
1.020602499547105
0.9547701472093025
1.2598174560413493
1.0086997644531541
0.8711594789918106
0.9669499056660755
Function gives approx. normal distribution around mean with given variance.
low + (random() / 100) * range
So for example:
0.90 + (random() / 100) * 0.2
How near? You could use a Gaussian (a.k.a. Normal) distribution with a mean of 1 and a small standard deviation.
A Gaussian is suitable if you want numbers close to 1 to be more frequent than numbers a bit further away from 1.
Some languages (such as Java) will have support for Gaussians in the standard library.
Divide by 100 and add 1. (I assume you are looking for a range from 0 to 2?)
You want a range from -1 to 1 as output from your rand() expression.
( rand(2) - 1 )
Then scale that -1 to 1 range as needed. Say, for a .1 variation on either side:
(( rand(2) - 1 ) / 10 )
Then just add one.
(( rand(2) - 1 ) / 10 ) + 1
rand() already gives you a random number between 0 and 100. The maximum number of different random values you can get from it is therefore about 100, so, assuming that you want up to three decimal places, 0.950-1.050 is the range you would be looking at.
The distribution can then be achieved by
0.95 + (rand() / 1000)
Are you looking for a random number in the range 1 to 2, like 1.1, 1.5, 1.632, etc.? If yes, then here is some simple Python code:
import random
print(random.random() + 1)
var randomNumber = Math.random();
while(randomNumber<0.9 && randomNumber>0.1){
randomNumber = Math.random();
}
if(randomNumber>=0.9){
alert(randomNumber);
}
else if(randomNumber<=0.1){
alert(1+randomNumber);
}
For numbers from 0.9 to 1.1
seed = 1
range = 0.1
if your random is from 0..100
f_rand = random/100
the generated number
gen_number = (seed+f_rand*range*2)-range
You will get
1.04; 1.08; 1.01; 0.96; ...
with seed 3, range 2 => 1.95; 4.08; 2.70; 3.06; ...
I didn't understand this (sorry):
I am trying to set a random number on either side of 1: .98, 1.02, .94, 1.1, etc.
So, I'll provide a general solution for the problem instead.
Converting a random number generator
If you have a random number generator in a given range [0, 1)* with uniform distribution you can convert it to any distribution using the following method:
1 - Describe the distribution as a function defined in the output range and with total area of 1. So this function is f(x) = the probability density of getting the value x.
2 - Integrate** the function.
3 - Equate it to the "randomic"***.
4 - Solve the equation for x. This gives you the value of x as a function of the randomic.
*: Generalization for any input distribution is below.
**: The constant term of the integrated function is 0 (that is, you just discard it).
***: That is a variable that represents the result of generating a random number with uniform distribution in the range [0, 1). [I'm not sure if that's the correct name in English]
Example:
Let's say you want a value with the distribution f(x)=x^2 from 0 to 100. Well that function is not normalized because the total area below the function in the range is 1000000/3 not 1. So you normalize it scaling the curve in the vertical axis (keeping the relative proportions), that is dividing by the total area: f(x)=3*x^2 / 1000000 from 0 to 100.
Now, we have a function with a total area of 1. The next step is to integrate it (you may already have done that to get the area) and equate it to the randomic.
The integrated function is: F(x)=x^3/1000000+c. And equate it to the randomic: r=x^3/1000000 (remember that we discard the constant term).
Now, we need to solve the equation for x, the resulting expression: x=100*r^(1/3). Now you can use this formula to generate numbers with the desired distribution.
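The worked example can be tried out directly. This is a minimal Python sketch of inverse-transform sampling for the density f(x) = 3*x^2 / 1000000 on [0, 100], using the formula x = 100*r^(1/3) derived above. The function name is hypothetical. As a sanity check, the theoretical mean of this distribution is 75.

```python
import random


def sample_x_squared(rng=random):
    """Draw one value with density 3*x^2 / 1000000 on [0, 100]."""
    r = rng.random()              # the "randomic": uniform on [0, 1)
    return 100 * r ** (1 / 3)     # inverse of F(x) = x^3 / 1000000


rng = random.Random(123)
draws = [sample_x_squared(rng) for _ in range(50000)]
print(sum(draws) / len(draws))    # should land close to 75
```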
Generalization
If you have a random number generator with a custom distribution and want another different arbitrary distribution, you first need the source distribution function and then use it to express the target arbitrary random number generator. To get the distribution function do the steps up to 3. For the target do all the steps, and then replace the randomic with the expression you got from the source distribution.
This is better understood with an example...
Example:
You have a random number generator with uniform distribution in the range [0, 100) and you want.. the same distribution f(x)=3*x^2 / 1000000 from 0 to 100 for simplicity [Since for that one we already did all the steps giving us x=100*r^(1/3)].
Since the source distribution is uniform the function is constant: f(z)=1. But we need to normalize for the range, leaving us with: f(z)=1/100.
Now, we integrate it: F(z)=z/100. And equate it to the randomic: r=z/100, but this time we don't solve it for x, instead we use it to replace r in the target:
x=100*r^(1/3) where r = z/100
=>
x=100*(z/100)^(1/3)
=>
x=100^(2/3)*z^(1/3)
And now you can use x=100^(2/3)*z^(1/3) (approximately 21.544*z^(1/3)) to calculate random numbers with the distribution f(x)=3*x^2 / 1000000 from 0 to 100 starting with a random number in the distribution f(z)=1/100 from 0 to 100 [uniform].
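To see the substitution in action, here is a short Python sketch that maps a source value z, uniform on [0, 100), through r = z/100 and then through x = 100*r^(1/3). The function name is an assumption. The endpoints map to themselves (0 to 0 and 100 to 100), which is a quick way to check the algebra.

```python
import math


def from_uniform_0_100(z):
    """Convert z, uniform on [0, 100), to density 3*x^2 / 1000000 on [0, 100]."""
    r = z / 100                   # source CDF: F(z) = z / 100
    return 100 * r ** (1 / 3)     # target inverse CDF applied to r


print(from_uniform_0_100(0), from_uniform_0_100(50), from_uniform_0_100(100))
```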
Note: If you have a normal distribution, use the bell function instead. The same method works for any other distribution. Take care of possible asymptotes some distributions may create; you may need to try different ways to solve the equations.
On discrete distributions
Sometimes you need to express a discrete distribution, for example, you want to get 0 with 95% chance and 1 with 5% chance. So how do you do that?
Well, you divide it in rectangular distributions in such way that the ranges join to [0, 1) and use the randomic to evaluate:
0 if r is in [0, 0.95)
f(r) = {
1 if r is in [0.95, 1)
Or you can take the complex path, which is to write a distribution function like this (making each option exactly a range of length 1):
0.95 if x is in [0, 1)
f(x) = {
0.05 if x is in [1, 2)
Since each range has a length of 1 and the assigned values sum up to 1 we know that the total area is 1. Now the next step would be to integrate it:
0.95*x if x is in [0, 1)
F(x) = {
(0.05*(x-1))+0.95 = 0.05*x + 0.9 if x is in [1, 2)
Equate it to the randomic:
0.95*x if x is in [0, 1)
r = {
0.05*x + 0.9 if x is in [1, 2)
And solve the equation...
Ok, to solve that kind of equation, start by calculating the output ranges by applying the function:
[0, 1) becomes [0, 0.95)
[1, 2) becomes [0.95, {(0.05*(x-1))+0.95 where x = 2} = 1)
Now, those are the ranges for the solution:
? if r is in [0, 0.95)
x = {
? if r is in [0.95, 1)
Now, solve the inner functions:
r/0.95 if r is in [0, 0.95)
x = {
20*(r-0.9) = 20*r-18 if r is in [0.95, 1)
But, since the output is discrete, we end up with the same result after doing integer part:
0 if r is in [0, 0.95)
x = {
1 if r is in [0.95, 1)
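The simple path (cutting [0, 1) into consecutive ranges, one per outcome) might look like this in Python. This is a sketch with a hypothetical function name; it accepts any list of (value, probability) pairs whose probabilities sum to 1.

```python
import random


def sample_discrete(outcomes, rng=random):
    """Pick a value from a list of (value, probability) pairs."""
    r = rng.random()                 # uniform on [0, 1)
    cumulative = 0.0
    for value, probability in outcomes:
        cumulative += probability    # upper edge of this outcome's range
        if r < cumulative:
            return value
    return outcomes[-1][0]           # guard against floating-point rounding


rng = random.Random(7)
draws = [sample_discrete([(0, 0.95), (1, 0.05)], rng) for _ in range(100000)]
print(draws.count(1) / len(draws))   # should be close to 0.05
```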
Note: using random to mean pseudo random.
Edit: Found it on wikipedia (I knew I didn't invent it).

Resources