I'm doing some maximum likelihood estimation in R. I have a dataset c(3,33,12,22,23) giving how long a battery lasts, and the lifetime is Exp(theta) distributed. I'm supposed to plot the maximum likelihood function. I have this:
liklihood <- function(theta, n){
rpois(theta,n)
}
ac <- c(3,33,12,22,23)
theta <- seq(from=0, to=30, length=10)
plot(theta, liklihood(theta, n=length(ac)), type="l",xlab=expression(theta),
ylab=expression(L(theta)), col="blue",
main="1 (b)")
but it does not look right. Can anyone please help me if you see what I am doing incorrectly?
You should check the definition of the likelihood: the likelihood is the product of the densities of the random sample, viewed as a function of the parameter. Your function liklihood instead draws samples (which is wrong) from a Poisson distribution, and probably with the wrong arguments - check ?rpois (the first parameter is the sample size and the second is lambda).
For the Poisson distribution:
likelihood <- function(theta, x) prod(dpois(x, theta))
log_likelihood <- function(theta, x) sum(log(dpois(x, theta)))
Now you want to calculate the likelihood for different values of theta, keeping the vector x the same (in your case that is the vector ac):
theta <- seq(0, 30, by = 0.5)
plot(theta, unlist(lapply(theta, likelihood, x = ac)))
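Since the raw likelihood underflows quickly for larger samples, it is often more convenient to plot the log-likelihood defined above instead; a minimal sketch (the grid starts above 0 because the Poisson log-likelihood is -Inf at theta = 0):
theta_grid <- seq(0.5, 30, by = 0.5)
plot(theta_grid, sapply(theta_grid, log_likelihood, x = ac), type = "l",
     xlab = expression(theta), ylab = expression(log ~ L(theta)))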
For the exponential distribution, replace dpois with dexp, but be aware that the mean of the exponential is 1/rate.
liklihood1 <- function(theta, x){
  prod(dexp(x, 1/theta))
}
ac <- c(3,33,12,22,23)
theta <- seq(0, 100, by = 1)
plot(theta, unlist(lapply(theta, liklihood1, x = ac)),
     type = "l", xlab = expression(theta), ylab = expression(L(theta)),
     col = "blue", main = "1 (b)")
is this correct for an exponential distribution?
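As a quick sanity check (my addition, not from the thread): for the exponential with mean theta, the likelihood L(theta) = theta^(-n) * exp(-sum(x)/theta) is maximised at the sample mean, so the curve should peak near mean(ac) = 18.6. For example, right after the plot:
mean(ac)                       ## 18.6
abline(v = mean(ac), lty = 2)  ## the likelihood curve should peak here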
f(x) = (1/2) e^(−|x − 2|), −∞ < x < ∞
Write a function to generate a random sample from the density above using the Inverse Transformation method, with sample size n = 10^4. Take the θ parameter as 2. After generating the random sample, do not forget to plot the sample and check the sample statistics.
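Integrating the density gives the CDF (a worked step for reference, writing θ for the location parameter):
F(x) = (1/2) e^(x − θ) for x < θ, and F(x) = 1 − (1/2) e^(−(x − θ)) for x ≥ θ.
Inverting F gives the transform x = θ + log(2u) for u < 1/2 and x = θ − log(2(1 − u)) for u ≥ 1/2, which is what the code below uses.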
theta <- 2
n <- 10^4
set.seed(361)
u <- runif(n)
## inverse CDF: one branch for u < 1/2, the other for u >= 1/2
x <- ifelse(u < 0.5, theta + log(2 * u), theta - log(2 * (1 - u)))
hist(x, prob = TRUE, main = "Double exponential, theta = 2")
y <- seq(min(x), max(x), 0.01)
lines(y, 0.5 * exp(-abs(y - theta)), col = "red")
I have a list of data:
frequency x1, x2, ..., xn (i.e. 10, 20, ..., 5000)
intensity y1, y2, ..., yn (i.e. 0, 0, ..., 50, 60, 50, ..., 0)
and I want to fit a normal distribution to the data.
I found some websites online, such as http://www.di.fc.ul.pt/~jpn/r/distributions/fitting.html, which use a procedure like
my_data <- rnorm(250, mean=1, sd=0.45)  # unknown distribution parameters
fit <- fitdistr(my_data, densfun="normal")
but obviously those methods won't work here, since I have (frequency, intensity) pairs rather than raw samples.
How to fit the above data to a normal distribution?
You can use the maximum likelihood function mle (from the stats4 package) to solve this problem. Here is how you would do that:
library(stats4)  ## provides mle()
my_data <- rnorm(250, mean=1, sd=0.45)  # unknown distribution parameters
logLik <- function(sigma, mu){
    ll <- vapply(my_data,
                 function(x) dnorm(x, mean = mu, sd = sigma),
                 FUN.VALUE = numeric(1))
    -sum(log(ll))
}
mle(logLik, start = list(sigma = 1, mu = 1))
mle requires a negative log-likelihood function, which it uses to determine the optimal parameters (in the case of a normal distribution, mu (mean) and sigma (standard deviation)). It minimises the negative sum of the log-likelihoods, -sum(log(ll)), as part of a numerical procedure to find the best parameters for the distribution. It then returns the estimated parameters:
Call:
mle(minuslogl = logLik, start = list(sigma = 1, mu = 1))
Coefficients:
sigma mu
0.4595003 0.9724402
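A brief usage note (my addition): if you store the fit, say as fit, the stats4 methods coef and vcov give the estimates and their approximate covariance from the Hessian:
fit <- mle(logLik, start = list(sigma = 1, mu = 1))
coef(fit)               ## point estimates
sqrt(diag(vcov(fit)))   ## approximate standard errors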
Is it possible to differentiate an ECDF? Take the one obtained in the following example.
set.seed(1)
a <- sort(rnorm(100))
b <- ecdf(a)
plot(b)
I would like to take the derivative of b in order to obtain its probability density function (PDF).
n <- length(a) ## `a` must be sorted in non-decreasing order already
plot(a, 1:n / n, type = "s") ## "staircase" plot; not "line" plot
However I'm looking to find the derivative of b
In sample-based statistics, an estimated density (for a continuous random variable) is not obtained from the ECDF by differentiation, because the sample size is finite and the ECDF is a step function, hence not differentiable. Instead, we estimate the density directly. I guess plot(density(a)) is what you are really looking for.
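For instance, a minimal sketch comparing the kernel density estimate with the true density of the simulated sample (a is standard normal here):
plot(density(a))                          ## kernel density estimate from the sample
curve(dnorm(x), add = TRUE, col = "red")  ## true N(0, 1) density for comparison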
A few days later...
Warning: the following is just a numerical solution without statistical grounding!
I take it as an exercise to learn about the R package scam for shape-constrained additive models, a child package of mgcv by Prof Wood's early PhD student Dr Pya.
The logic is as follows:
using scam::scam, fit a monotonically increasing P-spline to the ECDF (you have to specify how many knots you want); [note that monotonicity is not the only theoretical constraint: the smoothed ECDF should also be "clipped" at its two edges, to 0 on the left and to 1 on the right. I am currently using weights to impose this constraint, giving a very large weight to the two edge points];
using stats::splinefun, reparametrize the fitted spline as a monotone interpolation spline through the knots and the predicted values at the knots;
return the interpolation spline function, which can also evaluate the 1st, 2nd and 3rd derivatives.
Why I expect this to work:
As the sample size grows,
the ECDF converges to the CDF;
the P-spline is consistent, so the smoothed ECDF will be increasingly unbiased for the ECDF;
and the 1st derivative of the smoothed ECDF will be increasingly unbiased for the PDF.
Use with caution:
You have to choose the number of knots yourself;
the derivative is NOT normalized, so the area under the curve is not exactly 1;
the result can be rather unstable, and is only good for a large sample size.
function arguments:
x: a vector of samples;
n.knots: number of knots;
n.cells: number of grid points when plotting derivative function
You need to install scam package from CRAN.
library(scam)
test <- function (x, n.knots, n.cells) {
## get ECDF
n <- length(x)
x <- sort(x)
y <- 1:n / n
dat <- data.frame(x = x, y = y) ## make sure `scam` can find `x` and `y`
## fit a monotonically increasing P-spline for ECDF
fit <- scam::scam(y ~ s(x, bs = "mpi", k = n.knots), data = dat,
weights = c(n, rep(1, n - 2), 10 * n))
## interior knots
xk <- with(fit$smooth[[1]], knots[4:(length(knots) - 3)])
## spline values at interior knots
yk <- predict(fit, newdata = data.frame(x = xk))
## reparametrization into a monotone interpolation spline
f <- stats::splinefun(xk, yk, "hyman")
par(mfrow = c(1, 2))
plot(x, y, pch = 19, col = "gray") ## ECDF
lines(x, f(x), type = "l") ## smoothed ECDF
title(paste0("number of knots: ", n.knots,
"\neffective degree of freedom: ", round(sum(fit$edf), 2)),
cex.main = 0.8)
xg <- seq(min(x), max(x), length = n.cells)
plot(xg, f(xg, 1), type = "l") ## density estimated by scam
lines(stats::density(x), col = 2) ## a proper density estimate by density
## return smooth ECDF function
f
}
## try large sample size
set.seed(1)
x <- rnorm(1000)
f <- test(x, n.knots = 20, n.cells = 100)
f is a function as returned by stats::splinefun (read ?splinefun).
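Since the derivative is not normalised (see the caution above), one simple numerical fix, if you want the area under the estimated density to be 1, is to rescale by its trapezoidal integral; a sketch, assuming f is the function returned by test above:
xg <- seq(min(x), max(x), length = 100)
dens <- f(xg, deriv = 1)  ## unnormalised derivative of the smoothed ECDF
area <- sum(diff(xg) * (head(dens, -1) + tail(dens, -1)) / 2)  ## trapezoidal rule
plot(xg, dens / area, type = "l")  ## rescaled so the area is (approximately) 1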
A naive, similar solution is to build an interpolation spline through the ECDF directly, without smoothing. But this is a very bad idea, as it has no consistency.
g <- splinefun(sort(x), 1:length(x) / length(x), method = "hyman")
curve(g(x, deriv = 1), from = -3, to = 3)
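To see how unstable this is, overlay a kernel density estimate and the true density of the simulated sample on the curve above (a quick comparison, my addition):
lines(stats::density(x), col = 2)              ## kernel density estimate
curve(dnorm(x), add = TRUE, col = 4, lty = 2)  ## true N(0, 1) density of x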
A reminder: it is highly recommended to use stats::density for a direct density estimation.
I need to fit x-y data with a model, which is non-analytic. I have a function f(x) that calculates the model for each x numerically, but there is no analytical equation. For the fit, I use optim in R. I minimise RMS between the model and the data. It works well and returns reasonable parameters.
I would like to find confidence intervals (or at least standard errors) for the best-fitting parameters. I found on the internet that this can be done from the Hessian matrix, but only when maximising a log-likelihood function. I don't know how to do that; all I have is x, y and f(x), from which I compute the RMS. Alas, I have no good way of estimating the errors on y.
How can I find confidence intervals on my fit parameters?
Edit: perhaps an example in R might help explain what I'm asking for. This example uses a simple analytic function to fit the data; in my real case the function is non-analytic, so I cannot use, e.g., nls.
set.seed(666)
# generate data
x <- seq(100) / 100
y <- 0.5 * x + rnorm(100, sd = 0.03) + 0.2
# function to fit
f <- function(x, a, b) {
a * x + b
}
# error function to minimise: RMS
errfun <- function(par, x, y) {
a <- par[1]
b <- par[2]
err <- sqrt(sum((f(x, a, b) - y)^2))
}
# use optim to fit the model to the data
par <- c(1, 0)
res <- optim(par, errfun, gr=NULL, x, y)
# best-fitting parameters
best_a <- res$par[1]
best_b <- res$par[2]
The best fitting parameters are a = 0.50 and b = 0.20. I need to find 95% confidence intervals on these.
This is a job for the bootstrap:
(1) Create a large number of synthetic datasets x*. These are created by sampling from x with replacement, drawing the same number of data points as were in x. For example, if your data is (1,2,3,4,5,6), an x* might be (5,2,4,4,2,3) (note that values might appear multiple times, or not at all, because we are sampling with replacement).
(2) For each x*, calculate f(x*). If there are other parameters which don't depend on the data, don't change them (so f(x,a,b,c) becomes f(x*,a,b,c) as long as a,b,c don't depend on x). Call these quantities f*.
(3) You can estimate anything you want from these f*. If you want the standard deviation of f(x), take the standard deviation of f*. If you want the 95% confidence interval, take the range from the 2.5 to the 97.5 percentiles of f*. More formally, if you want to estimate g(f(x)) you estimate it as g(f(x*)).
I should say this is a very practically-oriented explanation of the bootstrap. I have glossed over many theoretical details, but the bootstrap is near-universally applicable (basically as long as the thing you are trying to estimate actually exists, you are usually okay).
To apply this to the example you have given in your code:
x <- seq(100) / 100
y <- 0.5 * x + rnorm(100, sd = 0.03) + 0.2
# function to fit
f <- function(x, a, b) {
a * x + b
}
# error function to minimise: RMS
errfun <- function(par, x, y) {
a <- par[1]
b <- par[2]
err <- sqrt(sum((f(x, a, b) - y)^2))
}
# this is the part where we bootstrap
# use optim to fit the model to the data
best_a <- best_b <- numeric(10000)
for(i in 1:10000){
j <- sample(100,replace=TRUE)
x.boot <- x[j]; y.boot <- y[j]
par <- c(1, 0)
res <- optim(par, errfun, gr=NULL, x.boot, y.boot)
# best-fitting parameters
best_a[i] <- res$par[1]
best_b[i] <- res$par[2]
}
# now, we look at the *vector* best_a
# for example, if you want the standard deviation of a,
sd(best_a)
# or a 95% confidence interval for b,
quantile(best_b,c(0.025,0.975))
So I used the rgl package and created my own likelihood function to output the log-likelihood of a sample from a normal distribution. I was doing this really just to learn how to program this myself, so I could better understand how likelihood and MLE work. Anyway, I noticed something particularly strange and I wanted to know if someone knew the answer here.
When I plot the graph, it comes out in a folded-curve shape, but I was expecting more of a cone-type shape. Basically, what I'm curious about is why, when the plot peaks at the sigma^2 value (along this axis there is a clear decline on both sides of the peak), the mu value stays roughly the same. It's as if, once the sigma^2 parameter has reached its optimal level, the differences in likelihood between mu values are pretty small. For example, when I check the variance of the likelihoods at the maximum point of sigma (keeping it constant), it's 11.5. In contrast, when I check the variance across the mu's at that same point, the variance is 23402.
Since I can't yet post images (I don't have enough reputation), I will just post my R code that produces the graph.
#Define LL function
LL <- function(X, theta)
{
mu <- theta[1]
sigma2 <- theta[2]
log.likelihood <- 0
n <- length(X)
for (i in 1:length(X))
{
log.likelihood <- log.likelihood - (((X[i]-mu)^2)/(2*sigma2)) -
log(sqrt(2*pi*sigma2))
}
return(log.likelihood)
}
#Parameters
Mu <- 100
Sigma2 <- 50
#Sample
N <- 100
set.seed(1)
IQs <- rnorm(N, mean=Mu, sd=sqrt(Sigma2))
#Possible values to test
x <- posMu <- seq(80, 120, length.out=200)
y <- posSig <- seq(20, 60, length.out=200)
#x1 <- sort(x, decreasing=T)
#Produce LLs for plotting
LLlist <- NULL
for (m in 1:length(posMu)){
LLs <- NULL
for(s in 1:length(posSig)){
posTheta <- cbind(posMu[m],posSig[s])
LLs <- c(LLs, LL(IQs,posTheta))
}
LLlist <- cbind(LLlist,LLs, deparse.level=0)
}
z <- LLlist
#Find the approximate MLE
mLL <- which(LLlist == max(LLlist), arr.ind=TRUE)
cbind(posMu[mLL[2]],posSig[mLL[1]],LLlist[mLL])
#Graph the LLs
library(rgl)
open3d()
plot3d(mean(x),mean(y),mean(z), xlab="Mu", ylab="Sigma2", zlab="log L", xlim=c(min(x),max(x)), ylim=c(min(y),max(y)), zlim=c(min(z),max(z)))
surface3d(x, y, z, color=rainbow(length(x)))
So, is my code just wrong? Or is this what an LL surface should look like? If so, why does sigma^2 show a clear curve with a peak, whereas mu hardly differs at the maximum? Thanks in advance!
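As a sanity check on what shape to expect (my addition, not part of the original post): holding sigma2 fixed, the normal log-likelihood drops by n*(xbar - mu)^2/(2*sigma2) as mu moves away from the sample mean, which over the plotted mu range (80 to 120, with n = 100 and sigma2 = 50) amounts to several hundred log-units, so the surface should be anything but flat along mu. You can confirm this with the LL function above:
LL(IQs, c(mean(IQs), 50))  ## log-likelihood at the sample mean, sigma2 = 50
LL(IQs, c(80, 50))         ## several hundred log-units lower
LL(IQs, c(mean(IQs), 20))  ## changing sigma2 over the plotted range moves it far less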