MLE of exponential distribution in R

If we generate a random vector from the exponential distribution:
exp.seq = rexp(1000, rate=0.10) # mean = 10
Now we want to use the generated vector exp.seq to re-estimate lambda, so we define the log-likelihood function:
fn <- function(lambda){
length(exp.seq)*log(lambda)-lambda*sum(exp.seq)
}
Now, using optim or nlm, I get very different values for lambda:
optim(lambda, fn) # I get here 3.877233e-67
nlm(fn, lambda) # I get here 9e-07
I used the same technique for the normal distribution and it works fine. So where is the mistake here?
I'm using my own definition for the exponential distribution because I will need to change it later.
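One likely cause, sketched under assumptions rather than confirmed by the poster: optim() and nlm() minimise by default, while fn is a log-likelihood that should be maximised. It falls to minus infinity as lambda approaches 0, which would explain estimates collapsing towards zero. The call also uses lambda as a starting value without defining it. The sketch below negates the log-likelihood and compares the result with the closed-form MLE 1/mean(x). The names negloglik, lambda_closed, fit_optimize and fit_optim, the seed, and the search interval are illustrative choices, not part of the original post.

```r
set.seed(123)                          # arbitrary seed, only for reproducibility
exp.seq <- rexp(1000, rate = 0.10)     # same setup as in the question

# Negative log-likelihood, because optim() and optimize() minimise by default
negloglik <- function(lambda) {
  -(length(exp.seq) * log(lambda) - lambda * sum(exp.seq))
}

# Closed-form MLE of the exponential rate, used here as a reference value
lambda_closed <- 1 / mean(exp.seq)

# One-dimensional search over an assumed bracket that keeps lambda positive
fit_optimize <- optimize(negloglik, interval = c(1e-6, 10))

# optim() needs method = "Brent" and finite bounds for a one-parameter problem
fit_optim <- optim(par = 1, fn = negloglik, method = "Brent",
                   lower = 1e-6, upper = 10)

print(c(closed = lambda_closed,
        optimize = fit_optimize$minimum,
        optim = fit_optim$par))
```

All three numbers should agree closely and sit near the true rate of 0.10. Bounding the search away from zero also avoids log() being evaluated at non-positive values.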

Related

Binomial Experiment

How do I use the Binomial function to solve this experiment:
number of trials -> n=18,
p=10%
success x=2
The answer is 28%. I am using Binomial(18, 0.1), but how do I pass x=2?
julia> d=Binomial(18,0.1)
Binomial{Float64}(n=18, p=0.1)
pdf(d,2)
How can I solve this in Julia?
What you want is the probability mass function (PMF): the probability that, in a binomial experiment of n independent Bernoulli trials, each with success probability p, we obtain exactly x successes.
The way to answer this question in Julia, using the Distributions package, is to first create the distribution object with parameters n and p, and then call the function pdf on this object and the value x:
using Distributions
n = 18 # number of trials in our experiments
p = 0.1 # probability of success of a single trial
x = 2 # number of successes for which we want to compute the probability/PMF
binomialDistribution = Binomial(n,p)
probOfTwoSuccesses = pdf(binomialDistribution,x)
Note that all the other probability-related functions (cdf, quantile, and also rand) work in the same way: you first build the distribution object, which embeds the specific distribution parameters, and then you call the function on the distribution object and the value you are interested in, e.g. quantile(binomialDistribution,0.9) for the 90% quantile.
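As a cross-check on the 28% figure, the PMF can also be written out by hand as choose(n, x) * p^x * (1 - p)^(n - x). A minimal sketch in R (the other language used on this page), with variable names that are only illustrative:

```r
n <- 18    # number of trials
p <- 0.1   # probability of success in a single trial
x <- 2     # number of successes of interest

# PMF written out explicitly: C(n, x) * p^x * (1 - p)^(n - x)
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)

# Same quantity from R's built-in binomial PMF
built_in <- dbinom(x, size = n, prob = p)

print(round(c(by_hand = by_hand, built_in = built_in), 4))
```

Both values come out at roughly 0.2835, which matches the 28% quoted in the question.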

Computing the likelihood of data for Binomial Distribution

I am following the book Statistical Rethinking, which has code in R, and I want to reproduce the same code in Julia. In the book, they compute the likelihood of six successes out of nine trials, where a success has a probability of 0.5. They achieve this using the following R code.
#R Code
dbinom(6, size = 9, prob=0.5)
#Out > 0.1640625
I am wondering how to do the same in Julia,
#Julia
using Distributions
b = Binomial(9,0.5)
# It's possible to look at a random value,
rand(b)
#Out > 5
But how do I look at a specific value such as six successes?
I'm sure you know this, but just to be sure: R's dbinom function is the probability mass function (the discrete analogue of a density) for the Binomial distribution.
Julia's Distributions package makes use of multiple dispatch to just have one generic pdf function that can be called with any type of Distribution as the first argument, rather than defining a bunch of methods like dbinom, dnorm (for the Normal distribution). So you can do:
julia> using Distributions
julia> b = Binomial(9, 0.5)
Binomial{Float64}(n=9, p=0.5)
julia> pdf(b, 6)
0.1640625000000001
There is also cdf, which works in the same way and (maybe unsurprisingly) computes the cumulative distribution function.
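For comparison with the book's R code, here is a small sketch of how the PMF and the CDF relate. It assumes only base R's dbinom and pbinom; the variable names are made up for illustration.

```r
size <- 9     # number of trials
prob <- 0.5   # probability of success

# PMF at six successes, the value computed in the book
pmf_six <- dbinom(6, size = size, prob = prob)

# CDF at six successes: P(X <= 6)
cdf_six <- pbinom(6, size = size, prob = prob)

# The CDF is the running sum of the PMF over 0, 1, ..., 6
cdf_by_sum <- sum(dbinom(0:6, size = size, prob = prob))

print(c(pmf = pmf_six, cdf = cdf_six, cdf_by_sum = cdf_by_sum))
```

The two CDF values should agree, which is the same relationship that pdf and cdf have in Julia's Distributions package.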

How do you generate samples from the logistic CDF using the inverse-CDF method

My question is how to generate a sample in R from a logistic CDF with the inverse CDF method. The logistic density is p(θ) = exp(θ)/(1 + exp(θ))^2
Here is the algorithm for that method:
1: for t = 1 to T do
2: sample q(t) ∼ Unif(0, 1)
3: θ(t) ← F^−1(q(t))
4: end for
Here is my code, but it just generates a vector of identical values. The density should be log-concave, which a histogram of these values obviously would not show. What is the problem?
First, define T as the number of draws taken from the uniform distribution:
T<-100000
sample_q<-runif(T,0,1)
It seems like plogis will give you the cumulative distribution function, so I suppose I can just take its inverse:
generate_samples_from_logistic_CDF <- function(p) {
for(t in length(T))
cdf<-plogis((1+exp(p)/(exp(p))))
inverse_cdf<-(1/cdf)
return(inverse_cdf)
}
Calling generate_samples_from_logistic_CDF(sample_q) should generate the samples,
but instead it gives me the same value for everything.
Since the inverse CDF is already coded in R as qlogis(), this should work:
qlogis(runif(100000))
or if you want to do it "by hand" rather than using the built-in qlogis(), you can use R <- runif(100000); log(R/(1-R))
Note that rlogis(100000) should be more efficient.
One of your confusions is that "inverse" in the algorithm description above doesn't mean the multiplicative inverse or reciprocal (i.e. 1/x), but rather the function inverse (which in this case is log(q/(1-q)))
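To make the function inverse concrete: solving q = exp(θ)/(1 + exp(θ)) for θ gives θ = log(q/(1 - q)). The sketch below applies that formula to uniform draws and compares the result with qlogis(). The seed, the sample size and the variable names are arbitrary choices for illustration.

```r
set.seed(2024)            # arbitrary seed, only for reproducibility
n_draws <- 100000         # number of uniform draws (T in the algorithm)

q <- runif(n_draws)       # step 2: q ~ Unif(0, 1)

# Step 3 by hand: the function inverse of F(theta) = exp(theta) / (1 + exp(theta))
theta_hand <- log(q / (1 - q))

# Step 3 with the built-in logistic quantile function
theta_builtin <- qlogis(q)

# The two versions should agree up to floating-point error
max_diff <- max(abs(theta_hand - theta_builtin))

# Sample moments versus the standard logistic values (mean 0, variance pi^2 / 3)
print(c(max_diff = max_diff,
        mean = mean(theta_hand),
        var = var(theta_hand),
        var_theory = pi^2 / 3))
```

Because the whole vector q is transformed at once, no explicit for loop is needed; hist(theta_hand) should then show the expected bell-like logistic shape.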

Generating random variables from the multivariate t-distribution

I wanted to generate random variables from a multivariate t-distribution in R. I am using the mvtnorm package, which has the command rmvt for generating random variables from the multivariate t-distribution. Now my question is about the syntax of the function and being able to manipulate it to do what I want. The function requires the following
rmvt(n, sigma = diag(2), df = 1, delta = rep(0, nrow(sigma)),
type = c("shifted", "Kshirsagar"), ...)
where sigma is a correlation matrix. Now what I am having trouble with is how to sample from a multivariate t-distribution with mean m and covariance matrix S. Is the following the appropriate syntax?
rmvt(1,S,df=n) + m
or
rmvt(1,R,df=n)*sigma + m
where my covariance matrix can be decomposed as S = sigma*R (i.e., R is my correlation matrix). I am getting different results when I run the two lines of code so that is partially where my confusion stems from.
Have a look at the help file for rmvt. There it says that sigma is the scale (not correlation) matrix and that the covariance matrix, which is only defined for df>2, is given by sigma * df/(df-2). Therefore, if you have a pre-specified covariance matrix S, you should set
sigma=S*(D-2)/D
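where D is the degrees of freedom. To generate n samples from the multivariate t-distribution with mean m and covariance matrix S you can either add the mean outside the call to rmvt, as you indicated (for n > 1, add it row-wise with sweep, because a plain + m recycles m down the columns of the n-row result):
sweep(rmvt(n, sigma=S*(D-2)/D, df=D), 2, m, "+")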
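or by passing the mean as an argument. In the mvtnorm signature quoted in the question this argument is named delta (with the default type = "shifted"):
rmvt(n, delta=m, sigma=S*(D-2)/D, df=D)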
Edit: For whatever reason, rmvt is not loading properly on my machine so I have to type this first to have the function loaded properly:
rmvt <- bfp:::rmvt
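The scale-versus-covariance relationship can be checked numerically without any package. The following is a hand-rolled sketch in base R, not the rmvt implementation: it builds multivariate t draws as a scaled normal vector divided by the square root of an independent chi-square over its degrees of freedom. The values of D, m and S are made up for the demonstration.

```r
set.seed(1)                           # arbitrary seed, only for reproducibility
D <- 10                               # degrees of freedom (must exceed 2)
m <- c(1, -2)                         # hypothetical target mean
S <- matrix(c(4, 1, 1, 2), 2, 2)      # hypothetical target covariance
n <- 200000                           # number of draws

sigma <- S * (D - 2) / D              # scale matrix implied by the covariance S

# Rows of Z are N(0, sigma): chol() returns R with t(R) %*% R == sigma
Z <- matrix(rnorm(n * 2), n, 2) %*% chol(sigma)

# One chi-square draw per row; row i of Z is divided by sqrt(W[i] / D)
W <- rchisq(n, df = D)
X <- Z / sqrt(W / D)

# Add the mean row-wise; a plain "+ m" would recycle m down the columns
X <- sweep(X, 2, m, "+")

print(round(colMeans(X), 2))          # should be close to m
print(round(cov(X), 2))               # should be close to S, not to sigma
```

The sample covariance lands near S rather than near sigma, which is why the scale matrix has to be shrunk by (D-2)/D when a specific covariance is wanted.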

Log Likelihood using R

I have a probability density function (PDF)
(1-cos(x-theta))/(2*pi)
theta is the unknown parameter. How do I write a log-likelihood function for this PDF? I am confused: the x will come from my data, but how do I handle the theta in the equation?
Thanks
You need to use an optimisation function in R to compute the value of theta that maximises the log-likelihood. See help(optimize) or help(nlm) for starters.
The function you wrote is a likelihood function of theta given the known x:
ll(theta|x) = log((1-cos(x-theta))/(2*pi))
If you have many iid observations from this distribution, x1,x2,...xn, just take the sum of the above:
ll(theta|x1,x2,...) = Sum[log((1-cos(xi-theta))/(2*pi))]
If f(x_i) = (1-cos(x_i-theta))/(2*pi) for observation i, then the likelihood function is L(theta) = product(f(x_i)) and logL(theta) = sum(log(f(x_i))), assuming of course that the x_i are independent.
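The log-likelihood is not limited to normal distributions: it is defined for any density. The log is convenient because it turns the product over observations into a sum. Cancelling an exp-function is a bonus for densities that contain one, and there is no exp-function here.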
Btw., your PDF is periodic and theta just manipulates the phase of that function. Where does this PDF come from? What should it describe?
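Putting the pieces above together, a minimal sketch in R might look as follows. It assumes the data live on [0, 2*pi), simulates them by rejection sampling purely so the example is self-contained, and uses a coarse grid followed by optimize(), because the log-likelihood drops to minus infinity whenever theta coincides with an observation. The names r_cosine, loglik and theta_true are hypothetical and not from the original thread.

```r
set.seed(42)            # arbitrary seed, only for reproducibility
theta_true <- 1.0       # hypothetical true phase, chosen for the demo

# Rejection sampler for f(x) = (1 - cos(x - theta)) / (2 * pi) on [0, 2 * pi):
# propose x ~ Unif(0, 2 * pi) and accept with probability (1 - cos(x - theta)) / 2
r_cosine <- function(n, theta) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- runif(2 * n, 0, 2 * pi)
    keep <- runif(2 * n) < (1 - cos(x - theta)) / 2
    out <- c(out, x[keep])
  }
  out[seq_len(n)]
}
x <- r_cosine(1000, theta_true)

# Log-likelihood: sum of log densities over the iid observations
loglik <- function(theta, x) sum(log((1 - cos(x - theta)) / (2 * pi)))

# Coarse grid search first, since the surface has many local dips
grid <- seq(0, 2 * pi, length.out = 2001)[-2001]
ll_grid <- vapply(grid, loglik, numeric(1), x = x)
start <- grid[which.max(ll_grid)]
h <- grid[2] - grid[1]

# Local refinement around the best grid point
fit <- optimize(loglik, interval = c(start - h, start + h),
                x = x, maximum = TRUE)
theta_hat <- fit$maximum
print(c(theta_true = theta_true, theta_hat = theta_hat))
```

With real data, x would simply be the observed values and the simulation step would be dropped. Since the PDF is periodic in theta, the estimate is only identified up to multiples of 2*pi.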

Resources