I wanted to set up a small data frame in order to plot some points of the incomplete elliptic integral of the first kind for different values of the amplitude phi and the modulus k. The function to integrate is 1/sqrt(1 - (k*sin(x))^2) between 0 and phi. Here is the code I came up with:
v.phi <- seq(0, 2*pi, 1)
n.phi <- length(v.phi)
v.k <- seq(-1, +1, 0.5)
n.k <- length(v.k)
k <- rep(v.k, each = n.phi, times = 1)
phi <- rep(v.phi, each = 1, times = n.k)
df <- data.frame(k, phi)
func <- function(x, k) 1/sqrt(1 - (k*sin(x))^2)
df$area <- integrate(func,lower=0, upper=df$phi, k=df$k)
But this generates errors, so I am obviously making a mistake in constructing the new variable df$area... Could someone point me in the right direction?
integrate expects a single lower and a single upper limit, so call it once per row with mapply:
df$area <- mapply(function(phi,k){
integrate(func, lower=0, upper=phi, k=k)$value
}, df$phi, df$k)
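However, that still generates an error because some values of k are equal to 1 or -1, while the allowed values are -1 < k < 1. For k = +/- 1 the integrand becomes 1/|cos(x)|, which has a non-integrable singularity at x = pi/2, so you can't evaluate this integral once phi reaches pi/2.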
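A minimal sketch of the mapply approach with the problematic rows removed first. Using expand.grid to build the grid and dropping |k| = 1 are choices made here for illustration, not part of the original code:

```r
# Integrand of the incomplete elliptic integral of the first kind
func <- function(x, k) 1 / sqrt(1 - (k * sin(x))^2)

# All (phi, k) combinations; phi varies fastest, as in the question
df <- expand.grid(phi = seq(0, 2 * pi, 1), k = seq(-1, 1, 0.5))

# Keep only moduli strictly inside (-1, 1), where the integral is finite
df <- df[abs(df$k) < 1, ]

# One integrate() call per row
df$area <- mapply(
  function(phi, k) integrate(func, lower = 0, upper = phi, k = k)$value,
  df$phi, df$k
)
```

For k = 0 the integrand is identically 1, so the area column should simply reproduce phi on those rows, which makes a quick sanity check.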
Note that there's a better way to evaluate this integral: the incomplete elliptic function of the first kind is implemented in the gsl package:
> integrate(func, lower=0, upper=6, k=0.5)$value
[1] 6.458877
> gsl::ellint_F(6, 0.5)
[1] 6.458877
As I said, this function is not defined for k=-1 or k=1:
> gsl::ellint_F(6, 1)
[1] NaN
> gsl::ellint_F(6, -1)
[1] NaN
> integrate(func, lower=0, upper=6, k=1)
Error in integrate(func, lower = 0, upper = 6, k = 1) :
non-finite function value
I have the following function.
Let F(.) be the cumulative distribution function of the gamma distribution with shape = 1 and rate = 1. The denominator is the survival function S(x) = 1 - F(x), and g(x) is the mean residual life function.
I wrote the following code in R.
x = 5
denominator = 1 -pgamma(x, 1, 1)
numerator = function(t) (1 - pgamma(t, 1, 1))
intnum = integrate(numerator , x, Inf)
frac = intnum$value/denominator
frac
How can I find the maximum of the function g(x) over all x >= 0? Can I do this in R? Thank you very much for your help.
Before starting, I wrapped your code in a function:
surviveFunction <- function(x) {
  denominator = 1 - pgamma(x, 1, 1)
  numerator = function(t) (1 - pgamma(t, 1, 1))
  # sapply lets the function accept a vector x, since integrate needs scalar limits
  intnum = sapply(x, function(x) { integrate(numerator, x, Inf)$value })
  frac = intnum / denominator
  return(frac)
}
Then pass the function to curve, which evaluates it on a grid of x values and draws the plot:
df = curve(surviveFunction, from=0, to=45)
plot(df, type='l')
Then adjust xlim to zoom in on the maximum:
df = curve(surviveFunction, from=0, to=45,xlim = c(30,40))
plot(df, type='l')
From this plot we can guess that the global maximum is located near x = 35.
I suggest two options for finding it.
First, use the grid values stored in df:
> max(df$y,na.rm = TRUE)
1.054248 # maximum value
> df$x[which(df$y==(max(df$y,na.rm = TRUE)))]
35.55 # x at which the maximum occurs
Second, use optimize:
> optimize(surviveFunction, interval=c(34, 36), maximum=TRUE)
$maximum
[1] 35.48536
$objective
[1] 1.085282
But I think optimize does not necessarily find the global maximum.
See the result below:
optimize(surviveFunction, interval=c(0, 36), maximum=TRUE)
$maximum
[1] 11.11381
$objective
[1] 0.9999887
I guess the result above is a local maximum, not the global one.
So I suggest using the first approach.
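One thing worth checking before trusting a maximum near x = 35: a gamma distribution with shape = 1 and rate = 1 is the exponential distribution with rate 1, whose survival function is exp(-x), so analytically g(x) = 1 for every x. The bump in the plot may therefore be floating-point noise, because 1 - pgamma(x, 1, 1) loses precision in the far tail. A minimal sketch, assuming the upper tail is requested directly with lower.tail = FALSE (the names surv and mrl are only for illustration):

```r
# Survival function computed directly in the upper tail (avoids 1 - F cancellation)
surv <- function(t) pgamma(t, shape = 1, rate = 1, lower.tail = FALSE)

# Mean residual life: integral of S(t) from x to Inf, divided by S(x)
mrl <- function(x) {
  num <- sapply(x, function(xi) integrate(surv, xi, Inf)$value)
  num / surv(x)
}

vals <- mrl(c(0, 5, 20, 35))
```

Under these assumptions the values should all stay close to 1, up to the tolerance of integrate, which suggests there is no interior maximum to find for this particular distribution.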
I would like to find the root of log(x) = x^2 - 2 using uniroot in R:
f <- function(x) (log(x)+2-x^2)
uniroot(f, lower=0, upper=100000000)$root
But this shows the error
Error in uniroot(f, lower = 0, upper = 1e+08) : f() values at end
points not of opposite sign
uniroot requires an interval where the function has opposite signs at the two endpoints (since it uses a variation of the bisection method). It is a good idea to do a quick plot when you don't know roughly where to look:
f <- function(x) (log(x)+2-x^2)
x <- seq(0.0,4,0.01)
y <- f(x)
plot(x,y,ylim = c(-1,1),type = "l")
abline(h=0)
From the resulting plot you can see that there are two roots, one between 0 and 1, and one between 1 and 2:
uniroot(f,interval = c(0,1))$root #returns 0.1379346
uniroot(f,interval = c(1,2))$root #returns 1.564445
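The sign requirement can also be checked numerically before calling uniroot. This is a small sketch; the lower endpoint 0.01 (instead of 0, where log(x) is -Inf) and the tighter tol are choices made here, not taken from the answer:

```r
f <- function(x) log(x) + 2 - x^2

# Signs at candidate endpoints: negative, positive, negative
signs <- sign(f(c(0.01, 1, 2)))

# Each bracket has a sign change, so uniroot can be applied to both
r1 <- uniroot(f, interval = c(0.01, 1), tol = 1e-9)$root
r2 <- uniroot(f, interval = c(1, 2), tol = 1e-9)$root

# Residuals at the roots should be close to zero
res <- c(f(r1), f(r2))
```

Evaluating f at the returned roots is a cheap way to confirm that uniroot converged to actual zeros rather than to an endpoint.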
Given a Laplace Distribution proposal:
g(x) = 1/2*e^(-|x|)
and sample size n = 1000, I want to conduct the Monte Carlo (MC) integration for estimating θ:
θ = ∫_0^∞ e^(-√x - x/2) * sin^2(x) dx
via importance sampling. Eventually I want to calculate the mean and standard deviation of this MC estimate in R once I get there.
Edit (arrived late after the answer below)
This is what I have for my R code so far:
library(VGAM)
n = 1000
x = rexp(n,0.5)
hx = mean(2*exp(-sqrt(x))*(sin(x))^2)
gx = rlaplace(n, location = 0, scale = 1)
Now we can write a simple R function to sample from the Laplace distribution:
## `n` is sample size
rlaplace <- function (n) {
u <- runif(n, 0, 1)
ifelse(u < 0.5, log(2 * u), -log(2* (1 - u)))
}
Also write a function for the density of the Laplace distribution:
g <- function (x) ifelse(x < 0, 0.5 * exp(x), 0.5 * exp(-x))
Now, your integrand is:
f <- function (x) {
ifelse(x > 0, exp(-sqrt(x) - 0.5 * x) * sin(x) ^ 2, 0)
}
Now we estimate the integral using 1000 samples (set.seed for reproducibility):
set.seed(0)
x <- rlaplace(1000)
mean(f(x) / g(x))
# [1] 0.2648853
Also compare with numerical integration using quadrature:
integrate(f, lower = 0, upper = Inf)
# 0.2617744 with absolute error < 1.6e-05
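The question also asks for the standard deviation of the MC estimate. A minimal sketch, assuming the usual estimator sd(w) / sqrt(n) for the standard error of a sample mean, where w holds the ratios f(x) / g(x). The functions are repeated so the block runs on its own, and pmax is used inside sqrt only to avoid NaN warnings for negative draws:

```r
# Laplace sampler by inverse CDF, proposal density, and integrand
rlaplace <- function(n) {
  u <- runif(n)
  ifelse(u < 0.5, log(2 * u), -log(2 * (1 - u)))
}
g <- function(x) 0.5 * exp(-abs(x))
f <- function(x) ifelse(x > 0, exp(-sqrt(pmax(x, 0)) - 0.5 * x) * sin(x)^2, 0)

set.seed(0)
n <- 1000
x <- rlaplace(n)
w <- f(x) / g(x)            # integrand divided by proposal density

theta_hat <- mean(w)        # importance sampling estimate of theta
se_hat <- sd(w) / sqrt(n)   # estimated standard deviation of theta_hat
```

Roughly half of the draws are negative and contribute zero, so a proposal supported on x > 0 would likely give a smaller se_hat for the same n.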
Given the following function:
f(x) = (1/2*pi)(1/(1+x^2/4))
How do I identify its distribution and write this distribution function in R?
So this is your function right now (hopefully you know how to write an R function; if not, check writing your own function):
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2))
f is defined on (-Inf, Inf), so integrating over this range is an improper integral. Fortunately, f decays to 0 at the rate x ^ (-2), so the integral is finite and can be computed:
C <- integrate(f, -Inf, Inf)
# 9.869604 with absolute error < 1e-09
C <- C$value ## extract integral value
# [1] 9.869604
Then you want to normalize f, as we know that a probability density should integrate to 1:
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2)) / C
You can draw its density by:
curve(f, from = -10, to = 10)
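As for identifying the distribution: the normalising constant 9.869604 is pi^2, so the normalised density is 1 / (2 * pi * (1 + x^2 / 4)), which has the form of a Cauchy density with location 0 and scale 2. A quick check against the built-in dcauchy (the name f_norm is only for illustration):

```r
# Normalised density: the original f divided by its integral, pi^2
f_norm <- function(x) (pi / 2) * (1 / (1 + 0.25 * x^2)) / pi^2

# Compare with the Cauchy(location = 0, scale = 2) density on a grid
xs <- seq(-10, 10, by = 0.5)
max_diff <- max(abs(f_norm(xs) - dcauchy(xs, location = 0, scale = 2)))
```

If max_diff is at the level of rounding error, dcauchy, pcauchy, qcauchy and rcauchy with scale = 2 can be used directly instead of a hand-written density.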
Now that I have the probability density function, I was wondering how to create a random sample of, say, n = 1000 from this new distribution?
This is strictly a separate question, but it is fine to answer it here without opening a new thread, and it turns out to be subtle.
Compare
set.seed(0); range(simf(1000, 1e-2))
#[1] -56.37246 63.21080
set.seed(0); range(simf(1000, 1e-3))
#[1] -275.3465 595.3771
set.seed(0); range(simf(1000, 1e-4))
#[1] -450.0979 3758.2528
set.seed(0); range(simf(1000, 1e-5))
#[1] -480.5991 8017.3802
So I think e = 1e-2 is reasonable. We could draw samples, make a (scaled) histogram and overlay density curve:
set.seed(0); x <- simf(1000)
hist(x, prob = TRUE, breaks = 50, ylim = c(0, 0.16))
curve(f, add = TRUE, col = 2, lwd = 2, n = 201)
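For comparison, here is a sketch of exact inverse-CDF sampling. It relies on the observation that the normalised density 1 / (2 * pi * (1 + x^2 / 4)) is a Cauchy density with location 0 and scale 2, whose quantile function is 2 * tan(pi * (u - 0.5)). The helpers q_inv and r_inv are hypothetical names and are not the simf used above:

```r
# Quantile function of Cauchy(0, 2), obtained by inverting its CDF
q_inv <- function(u) 2 * tan(pi * (u - 0.5))

# Inverse-CDF sampler: push uniform draws through the quantile function
r_inv <- function(n) q_inv(runif(n))

# Check the quantile function against the built-in qcauchy
p <- c(0.1, 0.25, 0.5, 0.75, 0.9)
check <- cbind(inverse_cdf = q_inv(p), qcauchy = qcauchy(p, scale = 2))

set.seed(0)
x_inv <- r_inv(1000)
```

Because the Cauchy distribution has very heavy tails, a wide sample range is expected with this sampler, so a histogram usually needs a restricted x range to be readable.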
I want to use the mle function to get estimates of a and b in a Unif(a,b) distribution. But I get absurd estimates nowhere close to 1 and 3.
library(stats4)
set.seed(20161208)
N <- 100
c <- runif(N, 1, 3)
LL <- function(min, max) {
R <- runif(100, min, max)
suppressWarnings((-sum(log(R))))
}
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS",
lower = c(-Inf, 0), upper = c(Inf, Inf))
I got:
Call:
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS")
Coefficients:
min max
150.8114 503.6586
Any ideas of what's going on? Thank you in advance!
I would first point out where your code is wrong.
You need dunif, not runif. With x as the data vector, you may define:
LL <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
In my code below I did not use dunif; the density is just 1 / (b - a), so I wrote it directly.
You are generating samples inside the objective function. For U[a,b] this is OK as its density is free of x. But for other distributions the objective function would change at each iteration.
With box constraints, you need method = "L-BFGS-B", not the ordinary "BFGS". And you are not using the right constraints.
Now in more depth...
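For a length-n sample vector x from U[a, b], the likelihood is (b - a) ^ (-n) whenever a <= min(x) and max(x) <= b (and 0 otherwise), and the negative log-likelihood is n * log(b - a). This is smallest when the interval is as short as possible while still covering the data, so the MLEs are a = min(x) and b = max(x).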
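The closed-form result can be sketched in a couple of lines. The seed and sample below mirror the ones used later in this answer, and the name mle_hat is only for illustration:

```r
set.seed(20161208)
x <- runif(100, 1, 3)

# Negative log-likelihood, valid only when a <= min(x) and b >= max(x)
nll <- function(a, b) length(x) * log(b - a)

# Closed-form MLE: the tightest interval that still covers the data
mle_hat <- c(a = min(x), b = max(x))

# The tightest interval gives a smaller objective than the true interval [1, 3]
improves <- nll(min(x), max(x)) < nll(1, 3)
```

The estimates sit slightly inside the true interval by construction, since the sample minimum and maximum can never fall outside [1, 3].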
Numerical optimization is completely unnecessary, and is in fact impossible without constraints. Look at the gradient vector:
( n / (a - b), n / (b - a) )
The partial derivative w.r.t. a is always negative and the one w.r.t. b is always positive, so neither can be 0.
Numerical approach becomes feasible when we impose box constraints: -Inf < a <= min(x) and max(x) <= b < Inf. We know for sure that iteration terminates at the boundary.
My code below uses both optim and mle. Note that mle will fail when it inverts the Hessian matrix, because the Hessian is singular (shown here without the positive factor n):
-(b - a) ^ (-2)   (b - a) ^ (-2)
 (b - a) ^ (-2)  -(b - a) ^ (-2)
Code:
## 100 samples
set.seed(20161208); x <- runif(100, 1, 3)
# range(x)
# [1] 1.026776 2.984544
## using `optim`
nll <- function (par) log(par[2] - par[1]) ## objective function
gr_nll <- function (par) c(-1, 1) / diff(par) ## gradient function
optim(par = c(0,4), fn = nll, gr = gr_nll, method = "L-BFGS-B",
lower = c(-Inf, max(x)), upper = c(min(x), Inf), hessian = TRUE)
#$par
#[1] 1.026776 2.984544 ## <- reaches boundary!
#
# ...
#
#$hessian ## <- indeed singular!!
# [,1] [,2]
#[1,] -0.2609022 0.2609022
#[2,] 0.2609022 -0.2609022
## using `stats4::mle`
library(stats4)
nll. <- function (a, b) log(b - a)
mle(minuslogl = nll., start = list(a = 0, b = 4), method = "L-BFGS-B",
lower = c(-Inf, max(x)), upper = c(min(x), Inf))
#Error in solve.default(oout$hessian) :
# Lapack routine dgesv: system is exactly singular: U[2,2] = 0