I have this formula:
D5 = 10000
D4 = 9726.76
=SUM((D5 - D4) * ((1500 *1) / D4))
The result will be 42.13736126.
I'm looking for a way to start from that result and get back D5 (10000).
I hope this is understandable enough :)
I don't know what cell the 42.137 is in, but the formula is:
(Result*D4)/(1*1500)+D4
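A quick sanity check in R (plain variables standing in for the cells; the names are mine):
D4 <- 9726.76
result <- 42.13736126
(result * D4) / (1 * 1500) + D4 # ~10000, recovering D5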
Running a simulation of the St. Petersburg game paradox, I need to create these dummy variables for a certain number of outcomes. I have tried appending in a for loop but can't seem to get the correct answer. Below are the variables I need to create.
P(i) is an array of the form {1/2^1, 1/2^2, ..., 1/2^n}
It's pretty easy to do this in R:
# P[k] = 0.5^k for k = 1, ..., 100; the parentheses matter,
# since ^ binds tighter than :
P <- 0.5^(1:100)
d1 <- sum(P)
d2 <- sum(P[-1]) # or just d1 - 0.5
Or just use the geometric sum formula directly:
d1 <- 1 - 0.5^100
d2 <- 0.5 * (1 - 0.5^99)
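A quick check that the two routes agree (my own verification):
P <- 0.5^(1:100)
all.equal(sum(P), 1 - 0.5^100)            # TRUE
all.equal(sum(P[-1]), 0.5 * (1 - 0.5^99)) # TRUE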
While using the bhattacharyya.dist() function from library("fpc"), we run into a bug that gives -Inf when calculating the distance between two distributions that are identical. (The correct value in this case is zero.)
This happens because the function multiplies two large values together. I have posted the old code and the code with the fix below.
Could some experts please verify that this is correct, and also suggest how this fix can be brought to the attention of the owners of this library and distributed? Please note that I am very new to R and statistics, so I am concerned there might be some issues with this fix.
I have run some tests with matrices of different sizes and see the expected results. But admittedly (due to my lack of software development skills) it is nowhere close to a thorough test.
OLD ROUTINE
bhattacharyya.dist
function (mu1, mu2, Sigma1, Sigma2)
{
aggregatesigma <- (Sigma1 + Sigma2)/2
d1 <- mahalanobis(mu1, mu2, aggregatesigma)/8
d2 <- log(det(as.matrix(aggregatesigma))/sqrt(det(as.matrix(Sigma1)) *
det(as.matrix(Sigma2))))/2
out <- d1 + d2
out
}
NEW ROUTINE WITH BUG FIX
bhattacharyyaDistance <- function(mu1, mu2, Sigma1, Sigma2)
{
    aggregatesigma <- (Sigma1 + Sigma2)/2
    d1 <- mahalanobis(mu1, mu2, aggregatesigma)/8
    # The old version multiplied det(Sigma1) * det(Sigma2) before taking
    # the square root; that product can overflow to Inf, making the ratio
    # 0 and d2 = log(0) = -Inf:
    # d2 <- log(det(as.matrix(aggregatesigma))/sqrt(det(as.matrix(Sigma1)) *
    #     det(as.matrix(Sigma2))))/2
    # The fix divides by each sqrt(det(...)) separately, so intermediate
    # values stay in range:
    d2 <- log((det(as.matrix(aggregatesigma))/sqrt(det(as.matrix(Sigma1))))
        / sqrt(det(as.matrix(Sigma2))))/2
    out <- d1 + d2
    return(out)
}
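A minimal reproduction of the bug, assuming the original function comes from the fpc package (the matrix here is my own construction, sized so that det(Sigma1) * det(Sigma2) overflows while each determinant alone stays finite):
library(fpc) # for the original bhattacharyya.dist()
mu <- rep(0, 20)
Sigma <- diag(1e10, 20) # det(Sigma) = 1e200: finite, but its square is Inf
bhattacharyya.dist(mu, mu, Sigma, Sigma)    # old routine: -Inf
bhattacharyyaDistance(mu, mu, Sigma, Sigma) # fixed routine: 0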
I would like to compute log( exp(A1) + exp(A2) ).
The formula below
log(exp(A1) + exp(A2) ) = log[exp(A1)(1 + exp(A2)/exp(A1))] = A1 + log(1+exp(A2-A1))
is useful when A1 and A2 are large and numerically exp(A1)=Inf (or exp(A2)=Inf).
(this formula is discussed in this thread ->
How to calculate log(sum of terms) from its component log-terms). The formula also holds with the roles of A1 and A2 swapped.
My concern with this formula is when A1 and A2 are very small, i.e. very negative. For example, when A1 and A2 are:
A1 <- -40000
A2 <- -45000
then the direct calculation of log(exp(A1) + exp(A2) ) is:
log(exp(A1) + exp(A2))
[1] -Inf
Using the formula above gives:
A1 + log(1 + exp(A2-A1))
[1] -40000
which is the value of A1.
Using the formula above with the roles of A1 and A2 flipped gives:
A2 + log(1 + exp(A1-A2))
[1] Inf
Which of the three values is closest to the true value of log(exp(A1) + exp(A2))? Is there a robust way to compute log(exp(A1) + exp(A2)) that works both when A1 and A2 are small and when they are large?
Thank you in advance
You should use something with more accuracy to do the direct calculation.
It’s not “useful when [they’re] large”. It’s useful when the difference is very negative.
When x is near 0, then log(1+x) is approximately x. So if A1>A2, we can take your first formula:
log(exp(A1) + exp(A2)) = A1 + log(1+exp(A2-A1))
and approximate it by A1 + exp(A2-A1) (and the approximation will get better as A2-A1 is more negative). Since A2-A1=-5000, this is more than negative enough to make the approximation sufficient.
Regardless, if y is too far from zero (in either direction), exp(y) will overflow or underflow a double and give Inf or 0 (this is a double, right? what language are you using?). This explains your answers. But since exp(A2-A1) = exp(-5000) is close to zero, your answer is approximately -40000 + exp(-5000), which is indistinguishable from -40000, so that one is correct.
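In R, that observation generalizes to the usual trick of factoring out the maximum (a minimal sketch; the function name is mine):
logsumexp <- function(a) {
  m <- max(a) # factor out the largest term so exp() cannot overflow
  m + log(sum(exp(a - m)))
}
logsumexp(c(-40000, -45000)) # -40000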
With such a huge difference in exponents, the safest thing you can do without arbitrary precision is:
choose the bigger exponent, Am = max(A1, A2)
so: log(exp(A1) + exp(A2)) -> log(exp(Am)) = Am
That is the closest you can get in such a case, so in your example the result is -40000 + delta, where delta is something very small.
If you want to use the second formula, then it all comes down to computing log(1 + exp(A)):
if A is positive, the result is far from the real value
if A is negative, exp(A) truncates to zero, so log(1) = 0 and you get the same result as above
[Notes]
Your exponent difference corresponds to a factor of base^5000.
A single-precision 32-bit float can store numbers up to roughly 2^128.
A double-precision 64-bit float can store numbers up to roughly 2^1024.
So when your base is 10 or e, this is nowhere near enough for what you need. Quadruple precision would be enough here, but as soon as you change the exponent difference again you will quickly end up at the same point as now.
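You can see those double-precision limits directly in R (a quick illustration):
.Machine$double.xmax # ~1.797693e308, just under 2^1024
exp(700)  # ~1.01e304, still finite
exp(800)  # Inf: overflow
exp(-800) # 0: underflow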
[PS] If you need more precision without going to full arbitrary precision, you can try to create your own number class with an internal representation like number = a^b, where a and b are floats. For that you would need to code all the basic operations yourself: * and / are easy, but + and - are a nightmare, although there may be approaches/algorithms out there even for this.
I need to calculate a list of very small numbers such as
(0.1)^1000 and (0.2)^1200,
and then normalize them so they will sum up to one
i.e.
a1 = 0.1^1000,
a2 = 0.2^1200
And I want to calculate
a1' = a1/(a1+a2),
a2' = a2/(a1+a2).
I'm running into underflow problems, as I get a1=0. How can I get around this?
Theoretically I could deal with logs, and then log(a1) = 1000*log(0.1) would be a way to represent a1 without underflow problems - but in order to normalize I would need to get
log(a1+a2) - which I can't compute since I can't represent a1 directly.
I'm programming in R; as far as I can tell there is no data type like C#'s Decimal that would give me better than double precision.
Any suggestions will be appreciated, thanks
Mathematically speaking, one of those numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering whether this makes sense.
But to do that in general, you can use the idea from the logspace_add C-function that's underneath the hood of R. One can define logxpy ( =log(x+y) ) when lx = log(x) and ly = log(y) as :
logxpy <- function(lx,ly) max(lx,ly) + log1p(exp(-abs(lx-ly)))
Which means that we can use :
> la1 <- 1000*log(0.1)
> la2 <- 1200*log(0.2)
> exp(la1 - logxpy(la1,la2))
[1] 5.807714e-162
> exp(la2 - logxpy(la1,la2))
[1] 1
This function can be called recursively as well if you have more numbers. Mind you, 1 is still 1, and not 1 minus 5.807...e-162. If you really need more precision and your platform supports long double types, you could code everything in e.g. C or C++ and return the results later on. But if I'm right, R can - for the moment - only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
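For example, folding over a vector of log-values (my own usage sketch; the third value is arbitrary):
lvals <- c(1000 * log(0.1), 1200 * log(0.2), 900 * log(0.15))
ltotal <- Reduce(logxpy, lvals) # log of the sum, computed entirely in log space
exp(lvals - ltotal)             # normalized weights that sum to 1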
EDIT:
To do the math for you:
log(x+y) = log(exp(lx)+exp(ly))
= log( exp(lx) * (1 + exp(ly-lx)) )
= lx + log ( 1 + exp(ly - lx) )
Now you just take the largest as lx, and then you come at the expression in logxpy().
EDIT 2: Why take the maximum then? Easy: to make sure that the argument of exp() is negative. If the difference gets too big in the positive direction, exp() returns Inf, which is not a correct result; exp() of a large negative difference just returns 0, which allows for a far better result:
Say lx=1 and ly=1000, then :
> 1+log1p(exp(1000-1))
[1] Inf
> 1000+log1p(exp(1-1000))
[1] 1000
The Brobdingnag package deals with very large or small numbers, essentially wrapping Joris's answer into a convenient form.
library(Brobdingnag) # provides as.brob()
a1 <- as.brob(0.1)^1000
a2 <- as.brob(0.2)^1200
a1_dash <- a1 / (a1 + a2)
a2_dash <- a2 / (a1 + a2)
as.numeric(a1_dash)
as.numeric(a2_dash)
Try the arbitrary precision packages:
Rmpfr "R MPFR - Multiple Precision Floating-Point Reliable"
Ryacas "R Interface to the 'Yacas' Computer Algebra System" - may also be able to do arbitrary precision.
Maybe you can treat a1 and a2 as fractions. In your example, with
a1 = (a1num/a1denom)^1000 # 1/10
a2 = (a2num/a2denom)^1200 # 1/5
you would arrive at
a1' = (a1num^1000 * a2denom^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
a2' = (a1denom^1000 * a2num^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
which can be computed using the gmp package:
library(gmp)
# a1' = 5^1200 / (5^1200 + 10^1000), evaluated with exact big integers
a1 <- as.double(pow.bigz(5, 1200) / (pow.bigz(5, 1200) + pow.bigz(10, 1000)))
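The complementary weight follows the same pattern (my own addition for completeness):
a2 <- as.double(pow.bigz(10, 1000) / (pow.bigz(5, 1200) + pow.bigz(10, 1000)))
a1 + a2 # 1, as required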
This question states:
A Pythagorean triplet is a set of three natural numbers, a < b < c, for which
a^2 + b^2 = c^2
For example, 3^2 + 4^2 = 9 + 16 = 25 = 5^2.
There exists exactly one Pythagorean triplet for which a + b + c = 1000.
Find the product abc.
I'm not sure what it's asking. Are we trying to find a^2 + b^2 = c^2 and then plug those numbers into a + b + c = 1000?
You need to find the a, b, and c such that both a^2 + b^2 = c^2 and a + b + c = 1000. Then you need to output the product a * b * c.
These problems are often trivially solvable if you find the proper insight. The trick here is to use a little algebra before you ever write a loop. I'll give you one hint: look at the formula for generating Pythagorean triples. Can you write the sum of the side lengths in a useful way?
Like a large number of Project Euler problems, it's all about finding a set of numbers that simultaneously fulfil multiple constraints.
In this case, the constraints are:
1) a^2 + b^2 = c^2
2) a+b+c = 1000
In the early questions the solution can be as simple as nested loops which try each possible combination.
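A minimal brute-force sketch in R (the loop bounds are my own tightening of the naive search):
for (a in 1:332) {                        # a < b < c forces a < 1000/3
  for (b in (a + 1):((1000 - a) %/% 2)) { # b < c forces b < (1000 - a)/2
    c <- 1000 - a - b                     # the third side is fully determined
    if (a^2 + b^2 == c^2)
      cat("a =", a, "b =", b, "c =", c, "product =", a * b * c, "\n")
  }
}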