Using library(statmod), I can evaluate a univariate normal density using Gauss-Hermite quadrature. But how can I evaluate a bivariate normal distribution using Gauss-Hermite quadrature?
Any help will be appreciated. Thanks in advance.
I have given the code that I used for evaluating a univariate normal with 5-point Gauss-Hermite quadrature. But how can I do this for a bivariate normal distribution?
library(statmod)
## generating Gauss-Hermite quadrature points and weights
q=gauss.quad(n=5,kind="hermite")
## defining univariate normal function
mu=0
sigma=2
norm=function(b){
  ## normal density with mean mu and standard deviation sigma
  M=((2*pi*sigma^2)^(-1/2))*exp(-(1/2)*((b-mu)^2/sigma^2))
  return(M)
}
## approximating the integral of norm(b) using Gauss-Hermite method
sum(q$weights*norm(q$nodes)*exp(q$nodes^2))
Here is a minimal worked-through example, based on a bivariate standard normal (probability) density with a given covariance matrix sigma and the quadrature rules implemented in the mvQuad library.
Note that we use the Gauss-Legendre quadrature rule, which allows integration over an arbitrary bounded domain; Gauss-Hermite quadrature, by contrast, integrates over the unbounded domain (-∞, +∞), and since we're working with a bivariate standard normal probability density, the integral over that unbounded domain trivially equals 1. More generally, mvQuad::createNIGrid implements various quadrature rules, including Gauss-Hermite quadrature (see ?createNIGrid for details).
Define a covariance matrix for a bivariate standard normal probability density:
library(mvtnorm)
sigma <- matrix(c(1, 0.2, 0.2, 1), ncol = 2)
dens <- function(x) dmvnorm(x, sigma = sigma)
We are interested in the integral of dens over the domain x ∈ [-1, 2] and y ∈ [-1, 2].
We follow instructions from the mvQuad vignette to create a grid and rescale to the domain of interest
library(mvQuad)
grid <- createNIGrid(dim = 2, type = "GLe", level = 6)
rescale(grid, domain = rbind(c(-1, 2), c(-1, 2)))
Calculate the integral of the bivariate normal over the domain x ∈ [-1, 2] and y ∈ [-1, 2]:
quadrature(function(x) dmvnorm(x, sigma = sigma), grid = grid)
#[1] 0.6796583
This value is in good agreement with the value from pmvnorm (which computes the distribution function of the multivariate normal for arbitrary limits and covariance matrices)
pmvnorm(lower = c(-1, -1), upper = c(2, 2), sigma = sigma)
#[1] 0.6796584
#attr(,"error")
#[1] 1e-15
#attr(,"msg")
#[1] "Normal Completion"
Good Evening,
I fitted a four parameter logistic curve using R nls function with the following equation:
y = alpha + lambda/(1 + exp(-beta*(x - mu)))
I would like to determine the maximum slope of this curve and for this I would like to compute the derivative of the function. Do you know how I can find the derivative of this function and use it to determine the maximum slope or the maximum derivative value?
Thank you in advance,
Rohan
I can find the regular sigmoid equation y = 1/(1 + e^(-x)) and its derivative, but not with the parameters.
I am expecting some help with the derivative of my equation and a piece of script that can help me to find the maximum value.
Let us say that the parameters you have calculated are alpha = 1, lambda = 2, beta = 3 and mu = 4. Then create the derivative function fder and use optimize to find its maximum. Evidently the maximum slope occurs at x = mu and equals 1.5; substituting x = mu into the derivative fder shows that the maximum derivative equals lambda * beta / 4.
fder <- function(x, alpha = 1, lambda = 2, beta = 3, mu = 4) {}
body(fder) <- D(expression(alpha + lambda/(1+exp(-beta*(x-mu)))), "x")
optimize(fder, c(-10, 10), maximum = TRUE)
## $maximum
## [1] 3.99999
##
## $objective
## [1] 1.5
Starting from @G.Grothendieck's answer, here's a logical explanation of why the maximum derivative is lambda*beta/4:
- The maximum derivative of the unscaled logistic function 1/(1+exp(-x)) is 1/4, at x = 0.
- The maximum derivative of 1/(1+exp(-beta*x)) is beta/4, at x = 0 (you can look this up on Wikipedia).
- Adjusting the midpoint (e.g. 1/(1+exp(-beta*(x-mu)))) shifts the location of the maximum derivative to x = mu but doesn't change its value.
- Shifting the curve up by adding alpha (alpha + 1/(1+exp(-beta*(x-mu)))) doesn't change the max slope or its location.
- Scaling the curve by lambda (alpha + lambda/(1+exp(-beta*(x-mu)))) scales the max derivative by lambda (beta/4 → lambda*beta/4).
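As a quick numerical check, using the fder function and the parameter values (alpha = 1, lambda = 2, beta = 3, mu = 4) from the answer above:
fder(4)    # derivative at x = mu: 1.5
2 * 3 / 4  # lambda * beta / 4 = 1.5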
Assuming a probability distribution has a density kernel of $x^2 e^{-x^2/4}$ on $x > 0$,
what Monte Carlo methods can I use to estimate the mean and variance of the distribution?
We can use numerical methods here. First of all, we create a function to represent your probability density function (though this is not yet scaled so that its integral is 1 over its full domain):
pdf <- function(x) x^2 * exp(-x^2/4)
plot(pdf, xlim = c(0, 10))
To get the scaling factor to turn this into a genuine pdf, we can integrate this function over its domain of c(0, Inf).
integrate(pdf, 0, Inf)$value
#> [1] 3.544908
So now we can generate a genuine pdf by dividing our original pdf by this amount:
pdf <- function(x) x^2 * exp(-x^2/4) / 3.544908
plot(pdf, xlim = c(0, 10))
Now that we have a pdf, we can create a cdf with numerical integration:
cdf <- function(x) sapply(x, \(i) integrate(pdf, 0, i)$value)
plot(cdf, xlim = c(0, 10))
The inverse of the cdf is what we need, to be able to convert a sample taken from a uniform distribution between 0 and 1 into a sample drawn from our new distribution. We can create this inverse function using uniroot to find where the output of our cdf matches an arbitrary number between 0 and 1:
inverse_cdf <- function(p)
{
sapply(p, function(i) {
uniroot(function(a) {cdf(a) - i}, c(0, 100))$root
})
}
The inverse cdf looks like this:
plot(inverse_cdf, xlim = c(0, 0.99))
We are now ready to draw a sample from our distribution:
set.seed(1) # Makes this draw reproducible
x_sample <- inverse_cdf(runif(1000))
Now we can plot a histogram of our sample and ensure it matches the pdf:
hist(x_sample, freq = FALSE)
plot(function(x) pdf(x), add = TRUE, xlim = c(0, 6))
Now that we have a sample drawn from x, we can use the sample mean and variance as estimates for the distribution's mean and variance:
mean(x_sample)
#> [1] 2.264438
var(x_sample)
#> [1] 0.9265678
We can increase the accuracy of these estimates by increasing the sample size, i.e. by replacing the 1000 in inverse_cdf(runif(1000)) with a larger number.
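As a sketch of a sanity check, the "exact" mean and variance can also be obtained by numerically integrating the normalized pdf defined above, and they agree with the Monte Carlo estimates up to sampling error:
true_mean <- integrate(function(x) x * pdf(x), 0, Inf)$value
true_var  <- integrate(function(x) x^2 * pdf(x), 0, Inf)$value - true_mean^2
true_mean  # approximately 2.26
true_var   # approximately 0.91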
I have obtained the posterior density for part d: $2\theta^{-1}(1-\theta)^{-1}$. How do I plot the distribution in R to find the l and u such that $F_{\theta \mid x}(l) = 0.025$ and $F_{\theta \mid x}(u) = 0.975$? (the equal-tail interval)
Your result is erroneous. By Bayes' theorem, the posterior density is proportional to p(theta)P(X=2|theta) = 1-theta. So we recognize the Beta distribution Beta(1,2). To graph it in R, you can do:
curve(dbeta(x, 1, 2), from = 0, to = 1)
Now the posterior equi-tailed credible interval is given by the quantiles of this distribution. In R:
qbeta(0.025, 1, 2) # lower bound
qbeta(0.975, 1, 2) # upper bound
If you don't know the Beta distribution, you can get these quantiles by elementary calculations. The integral of 1-theta over [0,1] is 1/2, so the posterior density is 2(1-theta) (it must integrate to one). The posterior cumulative distribution function is therefore 2(theta - theta²/2) = -theta² + 2*theta. To get the p-quantile (with p = 0.025 and p = 0.975), you have to solve the equation -theta² + 2*theta = p in theta. This is a second-degree polynomial equation, easy to solve.
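For instance, here is a sketch of that elementary calculation in R: solving theta² - 2*theta + p = 0 on [0, 1] gives theta = 1 - sqrt(1 - p), which matches the Beta(1, 2) quantiles:
q_manual <- function(p) 1 - sqrt(1 - p)
q_manual(c(0.025, 0.975))       # quantiles from the quadratic formula
qbeta(c(0.025, 0.975), 1, 2)    # same values from the Beta(1, 2) quantile function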
Finding the central 95% CI is actually easier than finding the 95% HPD. As you have the density (PDF), you also know the CDF. The lower and upper limits of the central 95% CI are given by CDF(l) = 0.025 and CDF(u) = 0.975.
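As a sketch of that approach, using the posterior density 2(1 - theta) derived in the other answer, the CDF can be inverted numerically with uniroot, with no need to recognize a named distribution:
post_cdf <- function(theta) 2 * theta - theta^2
ci <- sapply(c(0.025, 0.975), function(p)
  uniroot(function(theta) post_cdf(theta) - p, interval = c(0, 1))$root)
ci   # equal-tailed 95% credible interval; matches qbeta(c(0.025, 0.975), 1, 2)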
In GAM (and GLM, for that matter), we're fitting a conditional likelihood model. So after fitting the model, for a new input x and response y, I should be able to compute the predictive probability or density of a specific value of y given x. I might want to do this to compare the fit of various models on validation data, for example. Is there a convenient way to do this with a fitted GAM in mgcv? Otherwise, how do I figure out the exact form of the density that is used so I can plug in the parameters appropriately?
As a specific example, consider a negative binomial GAM:
## From ?negbin
library(mgcv)
set.seed(3)
n<-400
dat <- gamSim(1,n=n)
g <- exp(dat$f/5)
## negative binomial data...
dat$y <- rnbinom(g,size=3,mu=g)
## fit with theta estimation...
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=nb(),data=dat)
And now I want to compute the predictive probability of, say, y=7, given x=(.1,.2,.3,.4).
Yes. mgcv is doing (empirical) Bayesian estimation, so you can obtain a predictive distribution. For your example, here is how.
# prediction on the link (with standard error)
l <- predict(b, newdata = data.frame(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4), se.fit = TRUE)
# By standard GLM asymptotic theory, the fitted link value is approximately normally distributed,
# so for a negative binomial with `log` link, mu = exp(link) is approximately log-normally distributed
p.mu <- function (mu) dlnorm(mu, l[[1]], l[[2]])
# joint density of `y` and `mu`
p.y.mu <- function (y, mu) dnbinom(y, size = 3, mu = mu) * p.mu(mu)
# marginal probability (not density as negative binomial is discrete) of `y` (integrating out `mu`)
# I have carefully written this function so it can take vector input
p.y <- function (y) {
scalar.p.y <- function (scalar.y) integrate(p.y.mu, lower = 0, upper = Inf, y = scalar.y)[[1]]
sapply(y, scalar.p.y)
}
Now since you want probability of y = 7, conditional on specified new data, use
p.y(7)
# 0.07810065
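Since p.y() accepts vector input, you can also look at the whole predictive pmf rather than a single value (a small sketch; the range 0:30 is only for illustration):
y.grid <- 0:30
plot(y.grid, p.y(y.grid), type = "h", xlab = "y", ylab = "Pr(y | new data)")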
In general, this numerical-integration approach is not easy. For example, if another link function like sqrt() is used for the negative binomial, the distribution of mu is not as straightforward (though also not difficult to derive).
Now I offer a sampling-based, or Monte Carlo, approach. This is most similar to a Bayesian procedure.
N <- 1000 # sample size
set.seed(0)
## draw N samples from posterior of `mu`
sample.mu <- b$family$linkinv(rnorm(N, l[[1]], l[[2]]))
## draw N samples from likelihood `Pr(y|mu)`
sample.y <- rnbinom(N, size = 3, mu = sample.mu)
## Monte Carlo estimation for `Pr(y = 7)`
mean(sample.y == 7)
# 0.076
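The same posterior draws also give you the whole predictive pmf at essentially no extra cost (a sketch):
pmf.mc <- prop.table(table(sample.y))
pmf.mc["7"]   # Monte Carlo estimate of Pr(y = 7), close to the value above
plot(pmf.mc, xlab = "y", ylab = "estimated Pr(y | new data)")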
Remark 1
Note that, being empirical Bayes, all the methods above are conditional on the estimated smoothing parameters. If you want something closer to a "full Bayes" treatment, set unconditional = TRUE in predict().
Remark 2
Perhaps some people assume the solution is as simple as this:
mu <- predict(b, newdata = data.frame(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4), type = "response")
dnbinom(7, size = 3, mu = mu)
Such a result is conditional on the estimated regression coefficients (treated as fixed, without uncertainty), so mu is fixed rather than random. This is not a predictive distribution; a predictive distribution integrates out the uncertainty of the model estimation.
I have some basic questions concerning the polyserial() {polycor} function.
Does a p-value exist for rho, or can it be calculated?
For the assumption of a bivariate normal, is the tested null hypothesis "Yes, bivariate normal"? That is, do I want a high or low p-value?
Thanks.
If you form the returned object with:
polS <- polyserial(x, y, ML=TRUE, std.err=TRUE) # ML estimate
... you should have no difficulty forming a p-value for the hypothesis rho == 0 using a z-statistic, i.e. the ratio of the estimate to its standard error. But that is not the same as testing the assumption of bivariate normality. For that you need to examine the "chisq" component of polS. The print method for objects of class "polycor" hands that to you in a nice little sentence. You interpret that result in the usual manner: low p-values are stronger evidence against the null hypothesis (here H0: bivariate normality). As a scientist, you do not "want" either result; you want to understand what the data are telling you.
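A minimal sketch of both computations, assuming polS was formed as above (and assuming the fitted object also stores the chi-square statistic and its degrees of freedom as polS$chisq and polS$df, as the print output suggests):
z <- polS$rho / sqrt(polS$var[1, 1])
2 * pnorm(-abs(z))                                    # two-sided p-value for H0: rho == 0
pchisq(polS$chisq, df = polS$df, lower.tail = FALSE)  # p-value of the bivariate-normality test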
I e-mailed the package author (because I had the same questions) and, based on his clarifications, I offer my answers:
First, the easy question: higher p-values (traditionally > 0.05) give you more confidence that the distribution is bivariate normal. Lower p-values indicate a non-normal distribution, BUT, if the sample size is sufficiently large and you use the maximum likelihood estimate (option ML=TRUE), non-normality doesn't matter; the correlation estimate is still reliable.
Now, for the harder question: to calculate the p-value, you need to:
1. Execute polyserial with the std.err=TRUE option to have access to more details.
2. From the resulting polyserial object, access the var[1, 1] element. var is the covariance matrix of the parameter estimates, and sqrt(var[1, 1]) is the standard error (which is displayed in parentheses in the output after the rho estimate).
3. From the standard error, calculate the p-value using the R code below.
Here's some copy-able R code to illustrate this, based on the example in the polyserial documentation:
library(mvtnorm)
library(polycor)
set.seed(12345)
data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
x <- data[,1]
y <- data[,2]
y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))
# 2-step estimate
poly_2step <- polyserial(x, y, std.err=TRUE)
poly_2step
##
## Polyserial Correlation, 2-step est. = 0.5085 (0.02413)
## Test of bivariate normality: Chisquare = 8.604, df = 11, p = 0.6584
std.err_2step <- sqrt(poly_2step$var[1, 1])
std.err_2step
## [1] 0.02413489
p_value_2step <- 2 * pnorm(-abs(poly_2step$rho / std.err_2step))
p_value_2step
## [1] 1.529176e-98
# ML estimate
poly_ML <- polyserial(x, y, ML=TRUE, std.err=TRUE)
poly_ML
##
## Polyserial Correlation, ML est. = 0.5083 (0.02466)
## Test of bivariate normality: Chisquare = 8.548, df = 11, p = 0.6635
##
## 1 2 3
## Threshold -0.98560 0.4812 1.50700
## Std.Err. 0.04408 0.0379 0.05847
std.err_ML <- sqrt(poly_ML$var[1, 1])
std.err_ML
## [1] 0.02465517
p_value_ML <- 2 * pnorm(-abs(poly_ML$rho / std.err_ML))
p_value_ML
## [1] 1.927146e-94
And to answer an important question that you didn't ask: you would want to always use the maximum likelihood version (ML=TRUE) because it is more accurate, except if you have a really slow computer, in which case the default 2-step approach is acceptable.