I have two distributions, one beta and one normal, and I need to find the intersection of their PDFs. I know the parameters for both and can see the intersection visually, but I am looking for a way to have R calculate the exact point. Does anybody have an idea of how to do this?
Use uniroot().
uniroot(function(x) dbeta(x, 1, 2)-dnorm(x, 0, 1), c(0, 1))
## $root
## [1] 0.862456
##
## $f.root
## [1] 5.220165e-05
##
## $iter
## [1] 3
##
## $estim.prec
## [1] 6.103516e-05
This solves the equation dbeta(x, ...) == dnorm(x, ...) w.r.t. x (in the interval [0, 1], as this is the support of the beta distribution), i.e. it finds the root of dbeta(x, ...) - dnorm(x, ...). The resulting list's root field gives you the answer (more or less precisely).
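To sanity-check the result, you can evaluate both densities at the reported root; they should (approximately) agree:
x0 <- uniroot(function(x) dbeta(x, 1, 2) - dnorm(x, 0, 1), c(0, 1))$root
dbeta(x0, 1, 2)  # ~0.275
dnorm(x0, 0, 1)  # ~0.275, i.e. the two pdfs do intersect here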
Related
Good day to all!
I have the following cubic equation.
Left <- P^3+4*P^2+6*P
Right <- 2
How do I get R to solve for P to get Left = Right?
Thanks in advance.
1. uniroot()
You could use uniroot() to search for a root of a function with respect to its first argument.
uniroot(\(x, y) x^3 + 4*x^2 + 6*x - y, c(0, 1), y = 2, extendInt = "yes")
$root
[1] 0.278161
$f.root
[1] -1.779565e-05
$iter
[1] 6
$init.it
[1] NA
$estim.prec
[1] 6.103516e-05
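As a quick check, plugging the reported root back into the polynomial should (approximately) reproduce the right-hand side:
P <- 0.278161
P^3 + 4*P^2 + 6*P  # should be close to 2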
2. polyroot()
If the function is a real or complex polynomial, you could specifically use polyroot(z), where z is the vector of polynomial coefficients in increasing order.
y <- 2
polyroot(c(-y, 6, 4, 1))
# [1] 0.2781631-0.000000i -2.1390815+1.616897i -2.1390815-1.616897i
Both approaches solve the equation with the root 0.278161. (Besides the real root, polyroot also returns the two complex roots.)
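If you only want the real root(s), you can drop the complex ones by filtering on a small imaginary-part tolerance (the 1e-8 threshold below is an arbitrary choice):
roots <- polyroot(c(-y, 6, 4, 1))
Re(roots[abs(Im(roots)) < 1e-8])
# [1] 0.2781631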
If you want symbolic solutions, I guess you can try Ryacas like below
> library(Ryacas)
> yac_str("Solve(P^3+4*P^2+6*P==2,P)")
[1] "{P==(71/27+Sqrt(187/27))^(1/3)-(Sqrt(187/27)-71/27)^(1/3)-4/3,P==Complex(-(4/3+((71/27+Sqrt(187/27))^(1/3)-(Sqrt(187/27)-71/27)^(1/3))/2),Sqrt(3/4)*((71/27+Sqrt(187/27))^(1/3)+(Sqrt(187/27)-71/27)^(1/3))),P==Complex(-(4/3+((71/27+Sqrt(187/27))^(1/3)-(Sqrt(187/27)-71/27)^(1/3))/2),-Sqrt(3/4)*((71/27+Sqrt(187/27))^(1/3)+(Sqrt(187/27)-71/27)^(1/3)))}"
I want to build a function that takes E[x] and Var[x] and gives me the mean and standard deviation parameters of a univariate lognormal variable.
E[x] = exp(mu + theta)
Var[x] = exp(2*mu + theta)*(exp(theta) - 1)
The function would take E[x] and Var[x] as input and as output would give me theta and mu
There are several packages that provide ways and means to solve a system of nonlinear equations. One of these is nleqslv.
You need to provide a function that returns the differences between the actual values of the equations and the desired values.
Load package nleqslv and define the following function
library(nleqslv)
f <- function(x, Ex, Varx) {
  y <- numeric(length(x))
  mu <- x[1]
  theta <- x[2]
  y[1] <- exp(mu + theta) - Ex                         # E[x] equation
  y[2] <- exp(2*mu + theta) * (exp(theta) - 1) - Varx  # Var[x] equation
  y
}
The vector x in the function contains the values of mu and theta.
An example with Ex=2 and Varx=3 and some random starting values
xstart <- c(1,1)
nleqslv(xstart,f,Ex=2,Varx=3)
gives the following
$x
[1] -0.6931472 1.3862944
$fvec
[1] -8.095125e-11 -8.111645e-11
$termcd
[1] 1
$message
[1] "Function criterion near zero"
$scalex
[1] 1 1
$nfcnt
[1] 31
$njcnt
[1] 2
$iter
[1] 22
See the manual of nleqslv for the meaning of the different elements of the return value of nleqslv.
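Incidentally, this particular system also has a closed-form solution you can use as a sanity check: dividing Varx by Ex^2 gives Varx/Ex^2 = 1 - exp(-theta), so theta and mu follow directly:
Ex <- 2; Varx <- 3
theta <- -log(1 - Varx / Ex^2)  # from Varx/Ex^2 = 1 - exp(-theta)
mu <- log(Ex) - theta           # from Ex = exp(mu + theta)
c(mu, theta)
# [1] -0.6931472  1.3862944
which matches the nleqslv solution above.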
If you want to investigate the effect of the different solving methods try this
testnslv(xstart,f,Ex=2,Varx=3)
I need to compute a ratio of integrals, where the function q_alpha(z) is the quantile function of a standard normal distribution.
I have a question regarding the denominator. Since the standard normal distribution is symmetric, continuous, etc., is the integration of the denominator term simple? Do I just need to square each quantile of this function and proceed with the calculation?
This is my code in R:
library(Bolstad)
thau=1:99/100
z.standard.quantile=qnorm(thau,0,1)
z.standard.quantile.square=qnorm(thau,0,1)^2
sintegral(thau[1:50],z.standard.quantile[1:50])$value/sintegral(thau[1:50], z.standard.quantile.square[1:50])$value
The result is: -0.8676396
There is no problem in taking the square of qnorm, but qnorm is unbounded near 0 (note that qnorm(0) is -Inf), so the integrand cannot be evaluated on a grid that includes the lower endpoint 0.
My second thought is that there is actually no need to use Bolstad::sintegral (Simpson's rule); the base R function integrate is sufficient. Alternatively, we can discretize qnorm and use the trapezoidal rule, because qnorm is a smooth function that is well approximated by linear interpolation.
I will write a function evaluating the ratio of integrals in your question, but with lower bound l:
## using `integrate`
f1 <- function (l) {
a <- integrate(qnorm, lower = l, upper = 0.5)$value
b <- integrate(function (x) qnorm(x) ^ 2, lower = l, upper = 0.5)$value
a / b
}
## using Trapezoidal rule, with `n` division on interval `[l, 0.5]`
f2 <- function (l, n) {
x <- seq(l, 0.5, length = n)
delta <- x[2] - x[1]
y1 <- qnorm(x)
y2 <- y1 ^ 2
a <- sum(y1[-1] + y1[-n]) / 2 * delta
b <- sum(y2[-1] + y2[-n]) / 2 * delta
a / b
}
These two functions return rather similar results, as we can test:
f1 (0.1)
# [1] -1.276167
f2 (0.1, 1000)
# [1] -1.276166
Now, the only thing of interest is the limiting behaviour when l -> 0 (in a numerical sense). Let's try
l <- 10 ^ (- (1:16))
# [1] 1e-01 1e-02 1e-03 1e-04 1e-05 1e-06 1e-07 1e-08 1e-09 1e-10 1e-11 1e-12
# [13] 1e-13 1e-14 1e-15 1e-16
y1 <- sapply(l, f1)
# [1] -1.2761674 -0.8698411 -0.8096179 -0.7996069 -0.7981338 -0.7979341
# [7] -0.7978877 -0.7978848 -0.7978846 -0.7978846 -0.7978846 -0.7978846
# [13] -0.7978846 -0.7978846 -0.7978846 -0.7978846
## quite a dense grid; takes some time to compute
y2 <- sapply(l, f2, n = 1e+6)
# [1] -1.2761674 -0.8698411 -0.8096179 -0.7996071 -0.7981158 -0.7979137
# [7] -0.7978877 -0.7978834 -0.7978816 -0.7978799 -0.7978783 -0.7978767
# [13] -0.7978750 -0.7978734 -0.7978717 -0.7978700
Now, it looks like there is a limit toward around -0.7978 as l -> 0.
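This agrees with the analytic value: substituting z = qnorm(t) (so dt = dnorm(z) dz) turns the numerator into the integral of z * dnorm(z) over (-Inf, 0], which equals -dnorm(0), and the denominator into the integral of z^2 * dnorm(z) over (-Inf, 0], which equals 1/2. Hence the limit is -2 * dnorm(0) = -sqrt(2/pi):
-sqrt(2 / pi)
# [1] -0.7978846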
Note, the -0.8676396 you got is actually about f1(0.01) or f2(0.01, 1e+6).
I want to find the mode (x-value) of a univariate density function using R's optimize function.
E.g., for a normal density f(x) ~ N(3, 1), the mode should equal the mean, i.e. x = 3.
I tried the following:
# Define the function
g <- function(x) dnorm(x = x, mean = 3, sd = 1)
Dvec <- c(-1000, 1000)
# First get the gradient of the function (grad() comes from the numDeriv package)
library(numDeriv)
gradfun <- function(x){ grad(g, x) }
# Find the maximum value
x_mode <- optimize(f=g,interval = Dvec, maximum=TRUE)
x_mode
This gives the incorrect value of the mode as:
$maximum
[1] 999.9999
$objective
[1] 0
This is incorrect, i.e. it returns the endpoint of the (-1000, 1000) interval rather than x = 3.
Could anyone please help fix the optimisation code?
It will be used with more generic functions of x once this simple test case works.
I would use optim for this, without specifying an interval. You can tailor the starting value by taking the maximum of the function over the originally guessed interval:
guessedInterval = min(Dvec):max(Dvec)
superStarSeed = guessedInterval[which.max(g(guessedInterval))]
optim(par=superStarSeed, fn=function(y) -g(y))
#$par
#[1] 3
#$value
#[1] -0.3989423
#$counts
#function gradient
# 24 NA
#$convergence
#[1] 0
#$message
#NULL
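For what it's worth, optimize itself also works on this test case if the search interval is not so wide that the density underflows to zero over most of it; with a narrower interval (c(-10, 10) here is an arbitrary choice) the original call recovers the mode:
optimize(f = g, interval = c(-10, 10), maximum = TRUE)
# $maximum is approximately 3; $objective is dnorm(3, 3, 1), ~0.3989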
I am trying to use http://rss.acs.unt.edu/Rdoc/library/stats/html/constrOptim.html in R to do optimization in R with some given linear constraints but not able to figure out how to set up the problem.
For example, I need to maximize $f(x,y) = \log(x) + \frac{x^2}{y^2}$ subject to the constraints $g_1(x,y) = x+y < 1$, $g_2(x,y) = x > 0$ and $g_3(x,y) = y > 0$. How do I do this in R? This is just a hypothetical example; do not worry about its structure, I am only interested in how to set it up in R.
Thanks!
Setting up the function was trivial:
fr <- function(x) { x1 <- x[1]
x2 <- x[2]
-(log(x1) + x1^2/x2^2) # need negative since constrOptim is a minimization routine
}
Setting up the constraint matrix was problematic due to a lack of much documentation, and I resorted to experimentation. The help page says "The feasible region is defined by ui %*% theta - ci >= 0". So I tested and this seemed to "work":
> rbind(c(-1,-1),c(1,0), c(0,1) ) %*% c(0.99,0.001) -c(-1,0, 0)
[,1]
[1,] 0.009
[2,] 0.990
[3,] 0.001
So I put in a row for each constraint/boundary:
constrOptim(c(0.99,0.001), fr, NULL, ui=rbind(c(-1,-1), # the -x-y > -1
c(1,0), # the x > 0
c(0,1) ), # the y > 0
ci=c(-1,0, 0)) # the thresholds
For this problem there is a potential difficulty in that for all values of x the function goes to Inf as y -> 0. I do get a max around x=.95 and y=0 even when I push the starting values out to the "corner", but I'm somewhat suspicious that this is not the true maximum which I would have guessed was in the "corner".
EDIT:
Pursuing this I reasoned that the gradient might provide additional "direction" and added a gradient function:
grr <- function(x) { ## Gradient of 'fr'
x1 <- x[1]
x2 <- x[2]
c(-(1/x[1] + 2 * x[1]/x[2]^2),
2 * x[1]^2 /x[2]^3 )
}
This did "steer" the optimization a bit closer to the c(.999..., 0) corner, instead of moving away from it, as it did for some starting values. I remain somewhat disappointed that the process seems to "head for the cliff" when the starting values are close to the center of the feasible region:
constrOptim(c(0.99,0.001), fr, grr, ui=rbind(c(-1,-1), # the -x-y > -1
c(1,0), # the x > 0
c(0,1) ), # the y > 0
ci=c(-1,0, 0) )
$par
[1] 9.900007e-01 -3.542673e-16
$value
[1] -7.80924e+30
$counts
function gradient
2001 37
$convergence
[1] 11
$message
[1] "Objective function increased at outer iteration 2"
$outer.iterations
[1] 2
$barrier.value
[1] NaN
Note: Hans Werner Borchers posted a better example on R-Help that succeeded in getting the corner values by setting the constraint slightly away from the edge:
> constrOptim(c(0.25,0.25), fr, NULL,
ui=rbind( c(-1,-1), c(1,0), c(0,1) ),
ci=c(-1, 0.0001, 0.0001))
$par
[1] 0.9999 0.0001