Julia: How do you add/subtract distribution functions?

As part of a mini-project where I am numerically solving a linear differential equation, I have to subtract a probability distribution from another distribution. Is there a way to do this in Julia? When I try:
a = Chi(3) - Uniform(0,1)
There is no method set up for this:
MethodError: no method matching -(::Chi{Float64}, ::Uniform{Float64})
Closest candidates are:
-(::UnivariateDistribution, ::Real) at C:\Users\Acer\.julia\packages\Distributions\Fl5RM\src\univariate\locationscale.jl:139
-(::ChainRulesCore.AbstractThunk, ::Any) at C:\Users\Acer\.julia\packages\ChainRulesCore\sHMAp\src\tangent_types\thunks.jl:30
-(::ChainRulesCore.ZeroTangent, ::Any) at C:\Users\Acer\.julia\packages\ChainRulesCore\sHMAp\src\tangent_arithmetic.jl:101
...

As I have said, the convolve function is defined in Distributions.jl; here is the documentation: https://juliastats.org/Distributions.jl/stable/convolution/. However, as noted in the comments above, it is not enough for your purposes.
Let me help you derive the PDF of Chi(3) - Uniform(0,1), assuming the two are independent.
Let X be the random variable you want. Since Chi(3) - Uniform(0,1) has the same distribution as Chi(3) + Uniform(-1,0), its PDF is f(x) = C(x+1) - C(x), where C is the CDF of the Chi(3) distribution.
So there is a closed form for the PDF of your distribution in terms of the CDF of Chi(3).
(I did the computations in my head, so it would be good if you double-checked them.)
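Since the derivation was done in one's head, it is worth a numerical sanity check. Here is a quick Monte Carlo sketch (in Python with SciPy, purely for verification; the same check is easy to port to Julia) comparing the claimed PDF f(x) = C(x+1) - C(x) against an empirical histogram density:

```python
import numpy as np
from scipy.stats import chi, uniform

rng = np.random.default_rng(0)
n = 200_000
# Samples of X = Chi(3) - Uniform(0, 1)
x = chi(3).rvs(n, random_state=rng) - uniform(0, 1).rvs(n, random_state=rng)

# Claimed PDF: f(t) = C(t + 1) - C(t), with C the Chi(3) CDF
f = lambda t: chi(3).cdf(t + 1) - chi(3).cdf(t)

# Compare against an empirical histogram density at a few points
pts = np.array([0.5, 1.0, 2.0])
w = 0.2
emp = np.array([np.mean(np.abs(x - q) < w / 2) / w for q in pts])
print(np.max(np.abs(emp - f(pts))))  # small (sampling + binning error only)
```

The agreement confirms the closed form; the derivation itself is just the convolution integral of the Chi(3) density over a unit-width window.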

Related

Julia: How do you subtract a Normal distribution from a Chi distribution? [duplicate]


Find zero of a nonlinear equation using Julia

After a process using SymPy in Julia, I generated a system of nonlinear equations. For the sake of simplicity, I am going to put an approximation here for the case of just one nonlinear equation. What I get is something like this equation:
R = (p) -> -5.0488*p + p^2.81 - 3.38/( p^(-1.0) )^2.0
I can plot the R function
using Plots
plot(R, 0,8)
We can see that the R function has two zeros: p = 0 and 5.850 < p < 8.75. I would like to find the positive zero. For this, I tried the nlsolve function, but it errors:
using NLsolve
nlsolve(R , 5.8)
MethodError: no method matching nlsolve(::var"#1337#1338", ::Float64)
Closest candidates are:
nlsolve(::Any, ::Any, !Matched::AbstractArray; inplace, kwargs...)
First, where am I going wrong with the nlsolve function?
If possible, I would appreciate a solution using the SymPy package in Julia.
This question has been answered on the Julia discourse here: https://discourse.julialang.org/t/find-zero-of-a-nonlinear-equation-using-julia/61974
It's always helpful to cross-reference when asking on multiple platforms.
For reference, the solution was
using NLsolve
function R(F, p)  # p is a vector, not a scalar; the residual is written into F
    F[1] = -5.0488*p[1] + p[1]^2.81 - 3.38/( p[1]^(-1.0) )^2.0
end
nlsolve(R, [5.8])
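As a cross-check on the bracket 5.850 < p < 8.75 (sketched here in Python with SciPy; in Julia, Roots.jl's find_zero plays the same role and does accept a plain scalar guess), a bracketing root-finder applied directly to the scalar R confirms where the positive zero lies:

```python
from scipy.optimize import brentq

R = lambda p: -5.0488 * p + p ** 2.81 - 3.38 / (p ** -1.0) ** 2.0

# R changes sign on (5.8, 8.75), so a bracketing method applies directly
root = brentq(R, 5.8, 8.75)
print(root)          # the positive zero of R
print(abs(R(root)))  # residual, effectively zero
```

This also shows why nlsolve felt heavyweight here: it targets systems F(p) = 0 with vector p, while a scalar equation only needs a one-dimensional solver.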

Random number simulation in R

I have been going through some random number simulation equations and found that R doesn't have a built-in function for the Pareto distribution.
An rpareto function can be written as:
rpareto <- function(n, a, l) {
  l * ((1 - runif(n))^(-1/a) - 1)
}
Can someone explain the intuitive meaning behind this?
It's a well known result that if X is a continuous random variable with CDF F(.), then Y = F(X) has a Uniform distribution on [0, 1].
This result can be used to draw random samples of any continuous random variable whose CDF is known: generate u, a Uniform(0, 1) random variable and then determine the value of x for which F(x) = u.
In specific cases, there may well be more efficient ways of sampling from F(.), but this will always work as a fallback.
It's likely (I haven't checked the accuracy of the code myself, but it looks about right) that the body of your function solves F(x) = u for known u in order to generate a random variable with a Pareto distribution. You can check it with a little algebra after getting the CDF from this Wikipedia page.
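Doing that algebra: the function body appears to invert the Lomax (Pareto type II) CDF F(x) = 1 - (1 + x/l)^(-a), since solving F(x) = u gives x = l((1 - u)^(-1/a) - 1). A hedged sketch in Python (the R code uses the same formula) comparing inverse-transform samples against that CDF:

```python
import numpy as np
from scipy.stats import lomax

rng = np.random.default_rng(1)
a, l, n = 3.0, 2.0, 200_000

# Inverse transform: solve F(x) = u for u ~ Uniform(0, 1)
u = rng.uniform(size=n)
x = l * ((1 - u) ** (-1 / a) - 1)

# Target CDF F(x) = 1 - (1 + x/l)^(-a), i.e. Lomax(a) with scale l
pts = np.array([0.5, 1.0, 3.0])
emp = np.array([np.mean(x <= q) for q in pts])
print(np.max(np.abs(emp - lomax(a, scale=l).cdf(pts))))  # sampling error only
```

Note that 1 - runif(n) could be replaced by runif(n) without changing the distribution; it is kept here only to mirror the R code.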

Range for continuous distribution in Julia

I am trying to calculate the density of a continuous random variable over a range in Julia using Distributions, but I am not able to define the range. I used the Truncated constructor to construct the distribution, but I have no idea how to define the range. By density function I mean P(a < X < b).
Would appreciate any help. The distribution I'm using is Gamma, btw!
Thanks
To get the maximum and minimum of the support of distribution d just write maximum(d) and minimum(d) respectively. Note that for some distributions this might be infinity, e.g. maximum(Normal()) is Inf.
What version of Julia and Distributions do you use? In Distributions v0.16.4, the range can easily be defined with the second and third arguments of Truncated.
julia> a = Gamma()
Gamma{Float64}(α=1.0, θ=1.0)
julia> b = Truncated(a, 2, 3)
Truncated(Gamma{Float64}(α=1.0, θ=1.0), range=(2.0, 3.0))
julia> p = rand(b, 1000);
julia> extrema(p)
(2.0007680527633305, 2.99864177354943)
You can see the documentation of Truncated by typing ?Truncated in the REPL and pressing Enter.
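If the goal is only P(a < X < b), truncation isn't even needed: the probability is a difference of CDF values, cdf(d, b) - cdf(d, a) in Julia. A minimal illustration of the same idea (sketched in Python/SciPy, matching the Gamma(α=1, θ=1) above):

```python
from scipy.stats import gamma

d = gamma(1.0)  # shape α = 1, scale 1, matching Gamma() above

# P(2 < X < 3) is just a difference of CDF values; no truncation required
p = d.cdf(3) - d.cdf(2)
print(p)  # ≈ 0.0855, i.e. exp(-2) - exp(-3) for the α = 1 case
```

Truncated is the right tool when you need a distribution object restricted to (a, b), e.g. to sample from it as in the answer above.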

Why do the inverse t-distributions for small values differ in Matlab and R?

I would like to evaluate the inverse Student's t-distribution function for small values, e.g., 1e-18, in Matlab. The degrees of freedom is 2.
Unfortunately, Matlab returns NaN:
tinv(1e-18,2)
NaN
However, if I use R's built-in function:
qt(1e-18,2)
-707106781
The result is sensible. Why can Matlab not evaluate the function for this small value? The Matlab and R results agree closely down to about 1e-15, but for smaller values the difference is considerable:
tinv(1e-16,2)/qt(1e-16,2) = 1.05
Does anyone know what is the difference in the implemented algorithms of Matlab and R, and if R gives correct results, how could I effectively calculate the inverse t-distribution, in Matlab, for smaller values?
It appears that R's qt may use a completely different algorithm than Matlab's tinv. I think that you and others should report this deficiency to The MathWorks by filing a service request. By the way, in R2014b and R2015a, -Inf is returned instead of NaN for small values (about eps/8 and less) of the first argument, p. This is more sensible, but I think they should do better.
In the interim, there are several workarounds.
Special Cases
First, in the case of the Student's t-distribution, there are several simple analytic solutions to the inverse CDF or quantile function for certain integer parameters of ν. For your example of ν = 2:
% for v = 2
p = 1e-18;
x = (2*p-1)./sqrt(2*p.*(1-p))
which returns -7.071067811865475e+08. At a minimum, Matlab's tinv should include these special cases (they only do so for ν = 1). It would probably improve the accuracy and speed of these particular solutions as well.
Numeric Inverse
The tinv function is based on the betaincinv function. It appears that it may be this function that is responsible for the loss of precision for small values of the first argument, p. However, as suggested by the OP, one can use the CDF function, tcdf, and root-finding methods to evaluate the inverse CDF numerically. The tcdf function is based on betainc, which doesn't appear to be as sensitive. Using fzero:
p = 1e-18;
v = 2;
x = fzero(@(x)tcdf(x,v)-p, 0)
This returns -7.071067811865468e+08. Note that this method is not very robust for values of p close to 1.
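The same root-finding idea can be sketched in Python terms, with SciPy's t.cdf playing the role of tcdf. The CDF is monotone, so fzero's single starting point is replaced here by an explicit sign-changing bracket:

```python
from scipy.optimize import brentq
from scipy.stats import t

p, v = 1e-18, 2

# Invert the CDF numerically: find x with t.cdf(x, v) = p.
# t.cdf(-1e12, 2) < 1e-18 < t.cdf(0, 2), so the bracket is valid.
x = brentq(lambda x: t.cdf(x, v) - p, -1e12, 0.0)
print(x)  # ≈ -7.0710678e+08
```

As with the Matlab version, this is only as accurate as the underlying CDF in the tail, and bracketing quantiles near p = 1 needs the symmetric trick on the upper tail instead.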
Symbolic Solutions
For more general cases, you can take advantage of symbolic math and variable precision arithmetic. You can use identities in terms of Gaussian hypergeometric functions, 2F1, as given here for the CDF. Thus, using solve and hypergeom:
% Supposedly valid for x^2 < v, but appears to work for your example
p = sym('1e-18');
v = sym(2);
syms x
F = 0.5+x*gamma((v+1)/2)*hypergeom([0.5 (v+1)/2],1.5,-x^2/v)/(sqrt(sym('pi')*v)*gamma(v/2));
sol_x = solve(p==F,x);
vpa(sol_x)
As noted above, tinv is based on betaincinv. There is no equivalent function, or even an incomplete beta function, in the Symbolic Math Toolbox or MuPAD, but a similar 2F1 relation for the incomplete beta function can be used:
p = sym('1e-18');
v = sym(2);
syms x
a = v/2;
F = 1-x^a*hypergeom([a 0.5],a+1,x)/(a*beta(a,0.5));
sol_x = solve(2*abs(p-0.5)==F,x);
sol_x = sign(p-0.5).*sqrt(v.*(1-sol_x)./sol_x);
vpa(sol_x)
Both symbolic schemes return results that agree on -707106781.186547523340184 using the default value of digits.
I've not fully validated the two symbolic methods above so I can't vouch for their correctness in all cases. The code also needs to be vectorized and will be slower than a fully numerical solution.
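The quoted 21-digit value can be cross-checked independently with variable-precision arithmetic in Python's mpmath, using the ν = 2 closed form from the Special Cases section:

```python
from mpmath import mp, mpf, sqrt

mp.dps = 30  # work with 30 significant digits
p = mpf('1e-18')

# Closed-form v = 2 quantile, evaluated in high precision
x = (2 * p - 1) / sqrt(2 * p * (1 - p))
print(x)  # -707106781.186547523340184...
```

This reproduces the symbolic result to all quoted digits, which supports (though does not fully validate) both symbolic schemes for this particular p and ν.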
