How to create a distribution function in R?

Given the following function:
f(x) = (1/2*pi)(1/(1+x^2/4))
How do I identify its distribution and write this distribution function in R?

So this is your function right now (hopefully you know how to write an R function; if not, check writing your own function):
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2))
f is defined on (-Inf, Inf), so integrating over this range gives an improper integral. Fortunately, f decays to 0 at the rate of x ^ (-2) as x goes to +/-Inf, so the integral is well defined and can be computed:
C <- integrate(f, -Inf, Inf)
# 9.869604 with absolute error < 1e-09
C <- C$value ## extract integral value
# [1] 9.869604
Then you want to normalize f, as we know that a probability density should integrate to 1:
f <- function (x) (pi / 2) * (1 / (1 + 0.25 * x ^ 2)) / C
You can draw its density by:
curve(f, from = -10, to = 10)
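As for identifying the distribution: after normalization the factor pi / 2 cancels against C = pi^2, leaving 1 / (2 * pi * (1 + x^2 / 4)), which is the Cauchy density with location 0 and scale 2. A minimal check, assuming base R's dcauchy and pcauchy (the names f_raw, f_norm and F_closed are only illustrative and do not come from the answer):

```r
# Unnormalized function and its normalizing constant, as in the answer above
f_raw  <- function(x) (pi / 2) * (1 / (1 + 0.25 * x^2))
C      <- integrate(f_raw, -Inf, Inf)$value
f_norm <- function(x) f_raw(x) / C

x <- seq(-10, 10, by = 0.5)

# The normalized density agrees with the Cauchy(location = 0, scale = 2) density
max_dens_diff <- max(abs(f_norm(x) - dcauchy(x, location = 0, scale = 2)))

# Closed-form distribution function (CDF) of the same Cauchy law
F_closed <- function(x) 0.5 + atan(x / 2) / pi
max_cdf_diff <- max(abs(F_closed(x) - pcauchy(x, location = 0, scale = 2)))
```

Both differences should be at the level of numerical noise, so pcauchy(x, 0, 2) can serve as the distribution function.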

Now that I have the probability density function, I was wondering how to create a random sample of, say, n = 1000 from this new distribution?
An off-topic question, but it is OK to answer here without your starting a new thread. It is useful, as it turns out to be subtle.
Compare the range of the samples for different values of e, the second argument of simf:
set.seed(0); range(simf(1000, 1e-2))
#[1] -56.37246 63.21080
set.seed(0); range(simf(1000, 1e-3))
#[1] -275.3465 595.3771
set.seed(0); range(simf(1000, 1e-4))
#[1] -450.0979 3758.2528
set.seed(0); range(simf(1000, 1e-5))
#[1] -480.5991 8017.3802
So I think e = 1e-2 is reasonable. We can draw samples, make a (scaled) histogram and overlay the density curve:
set.seed(0); x <- simf(1000)
hist(x, prob = TRUE, breaks = 50, ylim = c(0, 0.16))
curve(f, add = TRUE, col = 2, lwd = 2, n = 201)
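A minimal sketch of a sampler in the spirit of simf, assuming inverse-CDF sampling from the Cauchy(0, 2) quantile function with the uniforms restricted to (e, 1 - e) so that the extreme tails are cut off. The name sim_cauchy2 is hypothetical, and this is not necessarily how simf itself was written:

```r
# Hypothetical sampler: inverse-CDF draws from Cauchy(location 0, scale 2),
# with uniforms restricted to (e, 1 - e) to truncate the heavy tails.
sim_cauchy2 <- function(n, e = 1e-2) {
  u <- runif(n, min = e, max = 1 - e)
  2 * tan(pi * (u - 0.5))   # quantile function of Cauchy(0, 2)
}

set.seed(0)
x <- sim_cauchy2(1000, e = 1e-2)

# Largest magnitude that the truncation allows for this e
bound <- 2 * tan(pi * (0.5 - 1e-2))
```

Smaller values of e let more extreme draws through, which is the effect the range comparison above illustrates. Without any truncation, rcauchy(n, location = 0, scale = 2) draws from the same distribution directly.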

Related

Find the maximum of the function in R

I have the following function:
g(x) = (integral from x to Inf of (1 - F(t)) dt) / (1 - F(x))
Let F(.) be the cumulative distribution function of the gamma distribution with shape = 1 and rate = 1. The denominator is the survival function S(x) = 1 - F(x), and g(x) is the mean residual life function.
I wrote the following code in R.
x = 5
denominator = 1 -pgamma(x, 1, 1)
numerator = function(t) (1 - pgamma(t, 1, 1))
intnum = integrate(numerator , x, Inf)
frac = intnum$value/denominator
frac
How can I find the maximum of the function g(x) over all possible values of x >= 0? Can I do this in R? Thank you very much for your help.

Before starting, I wrapped your code in a function:
surviveFunction <- function(x) {
  denominator = 1 - pgamma(x, 1, 1)
  numerator = function(t) (1 - pgamma(t, 1, 1))
  # sapply lets the function accept a vector x
  intnum = sapply(x, function(x) integrate(numerator, x, Inf)$value)
  frac = intnum / denominator
  return(frac)
}
Then pass the function to curve, which evaluates it on a grid of x values and draws the plot:
df = curve(surviveFunction, from=0, to=45)
plot(df, type='l')
Then adjust xlim to look for the maximum value:
df = curve(surviveFunction, from=0, to=45,xlim = c(30,40))
plot(df, type='l')
Now we can guess that the global maximum is located near 35.
I suggest two options for finding the global maximum.
First, use the df data to find the maximum:
> max(df$y,na.rm = TRUE)
1.054248 #maximum value
> df$x[which(df$y==(max(df$y,na.rm = TRUE)))]
35.55 #maximum value of x
Second, use optimize:
> optimize(surviveFunction, interval=c(34, 36), maximum=TRUE)
$maximum
[1] 35.48536
$objective
[1] 1.085282
But I think optimize does not necessarily find the global maximum.
See below:
optimize(surviveFunction, interval=c(0, 36), maximum=TRUE)
$maximum
[1] 11.11381
$objective
[1] 0.9999887
The result above is not the global maximum; I guess it is a local maximum.
So I suggest using the first solution.
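One caveat on the numbers above: a gamma distribution with shape = 1 and rate = 1 is the exponential distribution with rate 1, so S(x) = exp(-x) and g(x) equals 1 for every x. The apparent peak near x = 35 is therefore most likely floating-point cancellation in 1 - pgamma(x, 1, 1), which is of the order of machine precision there. A minimal sketch that sidesteps this, assuming lower.tail = FALSE for the survival function and abs.tol = 0 so that integrate works to relative tolerance only (mrl is an illustrative name):

```r
# Mean residual life g(x) = integral_x^Inf S(t) dt / S(x), computed with the
# upper tail directly to avoid cancellation in 1 - pgamma(...).
mrl <- function(x, shape = 1, rate = 1) {
  surv <- function(t) pgamma(t, shape, rate, lower.tail = FALSE)
  num <- sapply(x, function(x0) {
    # abs.tol = 0: rely on the relative tolerance only, because the integral
    # itself becomes extremely small for large x0
    integrate(surv, lower = x0, upper = Inf, abs.tol = 0)$value
  })
  num / surv(x)
}

xs <- seq(0, 45, by = 5)
g_vals <- mrl(xs)
```

With this formulation the values should stay close to 1 across the whole range, which suggests that for these parameters there is no interior maximum to find. For other shape values g(x) is no longer constant and the grid search or optimize approach applies.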

Plotting incomplete elliptic integral of 1st kind

I wanted to set up a small data frame in order to plot some points of the incomplete elliptic integral of the 1st kind for different values of the amplitude phi and the modulus k. The function to integrate is 1/sqrt(1 - (k*sin(x))^2) between 0 and phi. Here is the code I came up with:
v.phi <- seq(0, 2*pi, 1)
n.phi <- length(v.phi)
v.k <- seq(-1, +1, 0.5)
n.k <- length(v.k)
k <- rep(v.k, each = n.phi, times = 1)
phi <- rep(v.phi, each = 1, times = n.k)
df <- data.frame(k, phi)
func <- function(x, k) 1/sqrt(1 - (k*sin(x))^2)
df$area <- integrate(func,lower=0, upper=df$phi, k=df$k)
But this generates errors, and I am obviously making a mistake in constructing the new variable df$area... Could someone point me in the right direction?

You can use mapply:
df$area <- mapply(function(phi, k) {
  integrate(func, lower = 0, upper = phi, k = k)$value
}, df$phi, df$k)
However that generates an error because there are some values of k equal to 1 or -1, while the allowed values are -1 < k < 1. You can't evaluate this integral for k = +/- 1.
Note that there's a better way to evaluate this integral: the incomplete elliptic function of the first kind is implemented in the gsl package:
> integrate(func, lower=0, upper=6, k=0.5)$value
[1] 6.458877
> gsl::ellint_F(6, 0.5)
[1] 6.458877
As I said, this function is not defined for k=-1 or k=1:
> gsl::ellint_F(6, 1)
[1] NaN
> gsl::ellint_F(6, -1)
[1] NaN
> integrate(func, lower=0, upper=6, k=1)
Error in integrate(func, lower = 0, upper = 6, k = 1) :
non-finite function value
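Putting the pieces together, here is a minimal sketch that fills the data frame row by row while skipping the moduli that cause the error, assuming it is acceptable to leave area as NA when |k| = 1 (the helper name ellint_area is illustrative, and expand.grid is used only as a shorter way to build the same grid):

```r
func <- function(x, k) 1 / sqrt(1 - (k * sin(x))^2)

# Grid of amplitude phi and modulus k, as in the question
df <- expand.grid(phi = seq(0, 2 * pi, 1), k = seq(-1, 1, 0.5))

# Integrate one (phi, k) pair; return NA for |k| = 1, which the answer
# above notes cannot be evaluated
ellint_area <- function(phi, k) {
  if (abs(k) >= 1) return(NA_real_)
  integrate(func, lower = 0, upper = phi, k = k)$value
}

df$area <- mapply(ellint_area, df$phi, df$k)
```

For k = 0 the integrand is identically 1, so area should simply equal phi, which gives a quick sanity check on the result.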

How to integrate the product of two functions

Suppose I am seeking to integrate the following function from 0 to 10:
How would I accomplish this in R?
Functions
# Functional form
fn <- function(t) -100*(t)^2 + 20000
# First derivative w.r.t. t
fn_dt <- function(t) -200*t
# Density function phi
phi <- approxfun(density(rnorm(35, 15, 7)))
# Delta t
delta <- 5

How about the following:
First off, we choose a fixed seed for reproducibility.
# Density function phi
set.seed(2017);
phi <- approxfun(density(rnorm(35, 15, 7)))
We define the integrand.
integrand <- function(x) {
  f1 <- -500 * x^2 + 100000
  f2 <- phi(x)
  f2[is.na(f2)] <- 0
  return(f1 * f2)
}
By default, approxfun returns NA if x falls outside the interval [min(x), max(x)]; since phi is based on the density of a normal distribution, we can replace NAs with 0.
Let's plot the integrand
library(ggplot2);
ggplot(data.frame(x = 0), aes(x)) + stat_function(fun = integrand) + xlim(-50, 50);
We use integrate to calculate the integral; here I assume you are interested in the interval [-Inf, +Inf].
integrate(integrand, lower = -Inf, upper = Inf)
#-39323.06 with absolute error < 4.6
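If the range stated in the question, 0 to 10, is what is wanted, only the limits change. A minimal sketch, assuming the same integrand as in this answer; the subdivisions value is an arbitrary, generous choice for the piecewise-linear output of approxfun:

```r
set.seed(2017)
phi <- approxfun(density(rnorm(35, 15, 7)))

integrand <- function(x) {
  f1 <- -500 * x^2 + 100000
  f2 <- phi(x)
  f2[is.na(f2)] <- 0   # outside the range of the density estimate, use 0
  f1 * f2
}

# Integrate over the finite range from the question instead of (-Inf, Inf)
res <- integrate(integrand, lower = 0, upper = 10, subdivisions = 1000L)
res$value
```

On this range f1 stays positive, so the result should be positive, unlike the negative value obtained over the whole real line.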

Solving (determining) a function at a point in R

In my below R code, I was wondering how I could find out what is rh1 when y == 0.5?
Note that y uses atanh(rh1), which can be converted back to rh1 using tanh().
rh1 <- seq(-1, 0.1, by = 0.001)
y <- pnorm(-0.13, atanh(rh1), 0.2)
plot(rh1, y, type = "l")

Analytical solution
Let X follow a normal distribution with mean mu and standard deviation 0.2. We want to find mu such that Pr (X < -0.13) = y.
Recall your previous question and my answer over there: Determine a normal distribution given its quantile information. Here we have something simpler, as there is only one unknown parameter and one piece of quantile information.
Again, we start by standardization:
Pr {X < -0.13} = y
=> Pr { [(X - mu) / 0.2] < [(-0.13 - mu) / 0.2] } = y
=> Pr { Z < [(-0.13 - mu) / 0.2] } = y # Z ~ N(0,1)
=> (-0.13 - mu) / 0.2 = qnorm (y)
=> mu = -0.13 - 0.2 * qnorm (y)
Now, let atanh(rh1) = mu => rh1 = tanh(mu), so in short, the analytical solution is:
tanh( -0.13 - 0.2 * qnorm (y) )
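As a quick check, the closed form can be wrapped in a function and evaluated at y = 0.5, where qnorm(0.5) = 0 and the result reduces to tanh(-0.13). A minimal sketch (rh1_from_y and its argument names are illustrative, not from the original answer):

```r
# Analytical inverse: the rh1 such that pnorm(q, atanh(rh1), sd) equals y
rh1_from_y <- function(y, q = -0.13, sd = 0.2) tanh(q - sd * qnorm(y))

rh1_half <- rh1_from_y(0.5)

# Plug the result back into the original expression; this should recover y
y_back <- pnorm(-0.13, atanh(rh1_half), 0.2)
```

Because the function is vectorized over y, it can also be used to trace the whole inverse curve, for example rh1_from_y(seq(0.1, 0.9, by = 0.1)).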
Numerical solution
It is a root finding problem. We first build the following function f, and we aim to find its root, i.e., the rh1 so that f(rh1) = 0.
f <- function (rh1, y) pnorm(-0.13, atanh(rh1), 0.2) - y
The standard one-dimensional root finder in R is uniroot, which implements Brent's method (a safeguarded combination of bisection and interpolation). I recommend reading Uniroot solution in R for how to work with it in general.
curve(f(x, 0.5), from = -1, to = 0.1); abline (h = 0, lty = 2)
We see there is a root in the interval (-0.2, 0), so:
uniroot(f, c(-0.2, 0), y = 0.5)$root
# [1] -0.129243

Your function is monotonic so you can just create the inverse function.
rh1 <- seq(-1,.1,by=.001)
y <- pnorm(-.13,atanh(rh1),.2)
InverseFun = approxfun(y, rh1)
InverseFun(0.5)
[1] -0.1292726

Monte Carlo integration using importance sampling given a proposal function

Given a Laplace Distribution proposal:
g(x) = 1/2*e^(-|x|)
and sample size n = 1000, I want to conduct Monte Carlo (MC) integration for estimating θ:
via importance sampling. Eventually I want to calculate the mean and standard deviation of this MC estimate in R once I get there.
Edit (arrived late after the answer below)
This is what I have for my R code so far:
library(VGAM)
n = 1000
x = rexp(n,0.5)
hx = mean(2*exp(-sqrt(x))*(sin(x))^2)
gx = rlaplace(n, location = 0, scale = 1)

We can write a simple R function to sample from the Laplace distribution by inverting its CDF:
## `n` is sample size
rlaplace <- function (n) {
  u <- runif(n, 0, 1)
  ifelse(u < 0.5, log(2 * u), -log(2 * (1 - u)))
}
Also write a function for density of Laplace distribution:
g <- function (x) ifelse(x < 0, 0.5 * exp(x), 0.5 * exp(-x))
Now, your integrand is:
f <- function (x) {
  ifelse(x > 0, exp(-sqrt(x) - 0.5 * x) * sin(x) ^ 2, 0)
}
Now we estimate the integral using 1000 samples (set.seed for reproducibility):
set.seed(0)
x <- rlaplace(1000)
mean(f(x) / g(x))
# [1] 0.2648853
Also compare with numerical integration using quadrature:
integrate(f, lower = 0, upper = Inf)
# 0.2617744 with absolute error < 1.6e-05
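The question also asks for the mean and standard deviation of the estimate. A minimal sketch, assuming the usual Monte Carlo standard error sd(w) / sqrt(n) of the importance weights w = f(x) / g(x). The names rlaplace_std, g_lap and f_int are illustrative, and pmax is an added guard so that sqrt never sees a negative draw:

```r
# Laplace(0, 1) sampler via inverse CDF, and its density
rlaplace_std <- function(n) {
  u <- runif(n)
  ifelse(u < 0.5, log(2 * u), -log(2 * (1 - u)))
}
g_lap <- function(x) 0.5 * exp(-abs(x))

# Integrand, zero for x <= 0; pmax avoids sqrt() of negative values
f_int <- function(x) {
  ifelse(x > 0, exp(-sqrt(pmax(x, 0)) - 0.5 * x) * sin(x)^2, 0)
}

set.seed(0)
n <- 1000
x <- rlaplace_std(n)
w <- f_int(x) / g_lap(x)      # importance weights

theta_hat <- mean(w)          # importance sampling estimate of the integral
theta_se  <- sd(w) / sqrt(n)  # estimated standard deviation of theta_hat
```

Roughly half of the Laplace draws are negative and contribute a weight of 0, so a proposal supported only on x > 0 would likely reduce the variance. That is a design choice outside what the question specifies.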
