Good Evening,
I fitted a four parameter logistic curve using R nls function with the following equation:
y = alpha + lambda / (1 + exp(-beta * (x - mu)))
I would like to determine the maximum slope of this curve and for this I would like to compute the derivative of the function. Do you know how I can find the derivative of this function and use it to determine the maximum slope or the maximum derivative value?
Thank you in advance,
Rohan
I can find the regular sigmoid equation y = 1/(1 + e^-x) and its derivative, but not the version with the parameters.
I am hoping for some help with the derivative of my equation and a piece of script that can find its maximum value.
Let us say that the parameters you have calculated are alpha = 1, lambda = 2, beta = 3 and mu = 4. Then create the derivative function fder and use optimize to find its maximum. The maximum slope occurs at x = mu and equals 1.5; equivalently, substituting x = mu into the derivative shows that the maximum slope is lambda * beta / 4.
## empty function with the fitted parameter values as defaults
fder <- function(x, alpha = 1, lambda = 2, beta = 3, mu = 4) {}
## set its body to the symbolic derivative of the 4PL curve with respect to x
body(fder) <- D(expression(alpha + lambda/(1 + exp(-beta*(x - mu)))), "x")
## maximize the derivative numerically
optimize(fder, c(-10, 10), maximum = TRUE)
## $maximum
## [1] 3.99999
##
## $objective
## [1] 1.5
Starting from @G.Grothendieck's answer, here's a logical explanation of why the maximum derivative is lambda*beta/4.
The maximum derivative of the unscaled logistic function is 1/4, at x=0
The maximum derivative of 1/(1+exp(-beta*x)) is beta/4, at x=0 (you can look this up on Wikipedia)
adjusting the midpoint (e.g. 1/(1+exp(-beta*(x-mu)))) shifts the location of the maximum derivative to x=mu but doesn't change its value
shifting the curve up by adding alpha (alpha + 1/(1+exp(-beta*(x-mu)))) doesn't change the max slope or its location
scaling the curve by lambda (alpha + lambda/(1+exp(-beta*(x-mu)))) scales the max derivative by lambda (beta/4 → lambda*beta/4)
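A quick numerical check of this reasoning, using the same illustrative parameter values as in the answer above (alpha = 1, lambda = 2, beta = 3, mu = 4) and the derivative written out by hand:
alpha <- 1; lambda <- 2; beta <- 3; mu <- 4
## hand-written derivative of alpha + lambda/(1 + exp(-beta*(x - mu)))
fder <- function(x) lambda * beta * exp(-beta * (x - mu)) / (1 + exp(-beta * (x - mu)))^2
opt <- optimize(fder, c(-10, 10), maximum = TRUE)
c(opt$maximum, opt$objective, lambda * beta / 4)   ## maximum at ~mu, slope ~lambda*beta/4 = 1.5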
I have obtained the posterior density for part d: $2\theta^{-1}(1-\theta)^{-1}$. How do I plot the distribution in R to find the $l$ and $u$ such that $F_{\theta \mid x}(l) = 0.025$ and $F_{\theta \mid x}(u) = 0.975$ (the equal-tailed interval)?
Your result is erroneous. By Bayes' theorem, the posterior density is proportional to p(theta)P(X=2|theta) = 1-theta. So we recognize the Beta distribution Beta(1,2). To graph it in R, you can do:
curve(dbeta(x, 1, 2), from = 0, to = 1)
Now the posterior equi-tailed credible interval is given by the quantiles of this distribution. In R:
qbeta(0.025, 1, 2) # lower bound
qbeta(0.975, 1, 2) # upper bound
If you don't know the Beta distribution, you can get these quantiles by elementary calculations. The integral of 1-theta on [0,1] is 1/2, so the posterior density is 2(1-theta) (it must integrate to one). The posterior cumulative distribution function is therefore 2(theta - theta²/2) = -theta² + 2theta. To get the p-quantile (with p=0.025 and p=0.975), you have to solve the equation -theta² + 2theta = p in theta. This is a second-degree polynomial equation, easy to solve.
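A hedged sketch of this elementary route in R: the root of the quadratic that lies in [0, 1] is theta = 1 - sqrt(1 - p), which can be checked against the Beta(1, 2) quantiles.
p <- c(0.025, 0.975)
1 - sqrt(1 - p)   ## roots of 2*theta - theta^2 = p in [0, 1]: about 0.0126 and 0.8419
qbeta(p, 1, 2)    ## the same values from the Beta(1, 2) quantile function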
Finding the central 95% CI is actually easier than finding the 95% HPD. As you have the density (PDF), you also know the CDF. The lower and upper limits of the central 95% CI are given by CDF(l) = 0.025 and CDF(u) = 0.975.
By using the library(statmod), I can evaluate a univariate normal distribution using Gauss-Hermite quadrature. But how can I evaluate a bivariate normal distribution using Gauss-Hermite quadrature?
Any help will be appreciated. Thanks in advance.
I have included the code that I used for evaluating the univariate normal with 5-point Gauss-Hermite quadrature. But how can I do this for a bivariate normal distribution?
library(statmod)
## generating Gauss-Hermite quadrature points and weights
q <- gauss.quad(n = 5, kind = "hermite")
## defining the univariate normal density (mean mu, standard deviation sigma)
mu <- 0
sigma <- 2
norm <- function(b) {
  M <- (2 * pi * sigma^2)^(-1/2) * exp(-(1/2) * ((b - mu)^2 / sigma^2))
  return(M)
}
## approximating the integral of norm(b) using the Gauss-Hermite rule
sum(q$weights * norm(q$nodes) * exp(q$nodes^2))
Here is a minimal worked-through example, based on a bivariate standard normal (probability) density with a given covariance matrix sigma and the quadrature rules implemented in the mvQuad library.
Note that we use the Gauss-Legendre quadrature rule, which allows integration over an arbitrary bounded domain, whereas the Gauss-Hermite rule integrates over the unbounded domain (-∞, +∞). Since we're working with a bivariate standard normal probability density, the integral over the unbounded domain trivially equals 1. Generally, mvQuad::createNIGrid implements various quadrature rules, including Gauss-Hermite quadrature (see ?createNIGrid for details).
Define a covariance matrix for the bivariate standard normal probability density:
library(mvtnorm)
sigma <- matrix(c(1, 0.2, 0.2, 1), ncol = 2)
dens <- function(x) dmvnorm(x, sigma = sigma)
We are interested in the integral of dens over the domain x ∈ [-1, 2] and y ∈ [-1, 2].
We follow instructions from the mvQuad vignette to create a grid and rescale to the domain of interest
library(mvQuad)
grid <- createNIGrid(dim = 2, type = "GLe", level = 6)
rescale(grid, domain = rbind(c(-1, 2), c(-1, 2)))
Calculate the integral of the bivariate normal over the domain x ∈ [-1, 2] and y ∈ [-1, 2]:
quadrature(function(x) dmvnorm(x, sigma = sigma), grid = grid)
#[1] 0.6796583
This value is in good agreement with the value from pmvnorm (which computes the distribution function of the multivariate normal for arbitrary limits and covariance matrices)
pmvnorm(lower = c(-1, -1), upper = c(2, 2), sigma = sigma)
#[1] 0.6796584
#attr(,"error")
#[1] 1e-15
#attr(,"msg")
#[1] "Normal Completion"
In a coin flip, we would like to compute p(theta|Data), where theta is the underlying parameter.
The prior follows a beta distribution with parameters a and b.
The likelihood follows a Bernoulli distribution which gives us the probability of coming up heads.
Here is the code implementation:
a = 1 # a and b are the beta distribution's parameters
b= 1
num = 1e5 #Number of candidate theta values
z= 17220 #Number of heads
N= 143293 #Total number of flips
Theta = seq(0.07,0.12, length.out= num)
prior = dbeta(Theta, a,b) #Compute the prior at each value
likelihood = Theta^z *(1-Theta)^(N-z)
pData = likelihood * prior /sum(likelihood * prior) #Compute evidence
posterior = likelihood*prior / pData
I would like to verify that the posterior is equal to the analytical solution Beta(a+z, N-z+b). However, since the likelihood underflows to 0 (the theta values are small and the exponents large), the evidence is NaN and so is the posterior.
I have tried computing the log likelihood, but it gives a large negative number which becomes 0 when taking the exponential.
Theta = seq(0.07, 0.12, length.out = num)
lprior = log(dbeta(Theta, a,b)) #Compute the log prior at each value
llikelihood = log(Theta)*z + log(1-Theta)*(N-z) #log likelihood
lpData = llikelihood + lprior - sum(llikelihood + lprior) #compute evidence
lposterior = llikelihood+lprior - lpData
posterior = exp(lposterior)
plot(Theta, posterior, type="l")
lines(Theta, exp(llikelihood), type="l")
lines(Theta, exp(lprior), type="l")
My ultimate goal is to have a nice graph that shows the posterior, likelihood and prior together. How should I be computing each value?
This answer has been provided through the comment section by @JosephClarkMcIntyre.
Here is a summary:
In a Bernoulli trial, when N (the total number of trials) and z (the total number of successes) are large and the underlying parameter theta is small, it is better to operate only in log space and never take the exponential (see the short illustration after this list).
Moreover, since the log function is increasing, comparing the log posteriors of two distributions is equivalent to comparing the posteriors themselves.
The above implementation was wrong because the formula for computing the evidence was not correct: the evidence is sum(likelihood * prior), and the code below computes it in log space as sum(log_likelihood + log_prior).
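A short illustration of the underflow problem (a sketch using the z and N from the question):
z <- 17220; N <- 143293; theta <- 0.1
theta^z * (1 - theta)^(N - z)               ## underflows to 0 in double precision
z * log(theta) + (N - z) * log1p(-theta)    ## large negative, but finite, on the log scale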
This is the final code, where the prior, likelihood and evidence are in the log space:
a = 1 # a and b are the beta distribution's parameters
b= 1
num_steps = 1e5
z= 17220 #Number of heads
N= 143293 #Total number of flips
Theta = seq(from=0.07,to=0.12, length.out= num_steps)
lprior = dbeta(Theta, a,b,log=TRUE) #Compute the log prior at each value
llikelihood = log(Theta)*z + log1p(-Theta)*(N-z) #log likelihood
lpData = sum(llikelihood + lprior) #compute log of the evidence
lposterior = llikelihood+lprior - lpData
plot(Theta,log(dbeta(Theta,a+z,N-z+b)))
plot(Theta, lposterior, type="l")
However, the analytical and the computed log posterior are not the same, as shown in the graph.
Feel free to comment if you think there is a flaw in this answer or explain why the analytical and computed log posterior are not the same. ^^
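One hedged explanation (a sketch, not part of the original answer): in log space the evidence is the log of a sum, not the sum of the logs, so the grid can instead be normalized with a log-sum-exp; with that normalization the gridded posterior matches the analytical Beta(a + z, N - z + b) posterior discretized on the same grid.
## hedged sketch: normalize in log space with a log-sum-exp, reusing
## `Theta`, `llikelihood`, `lprior`, `a`, `b`, `z`, `N` from the final code above
log_sum_exp <- function(lx) max(lx) + log(sum(exp(lx - max(lx))))
lpost <- llikelihood + lprior
lpost <- lpost - log_sum_exp(lpost)               ## normalized log posterior on the grid
lpost.exact <- dbeta(Theta, a + z, N - z + b, log = TRUE)
lpost.exact <- lpost.exact - log_sum_exp(lpost.exact)  ## analytical posterior, same discretization
plot(Theta, lpost, type = "l")
lines(Theta, lpost.exact, col = 2, lty = 2)       ## the two curves should coincide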
Is it possible to differentiate an ECDF? Take the one obtained in the following example.
set.seed(1)
a <- sort(rnorm(100))
b <- ecdf(a)
plot(b)
I would like to take the derivative of b in order to obtain its probability density function (PDF).
n <- length(a) ## `a` must be sorted in non-decreasing order already
plot(a, 1:n / n, type = "s") ## "staircase" plot; not "line" plot
However, I'm looking to find the derivative of b.
In sample-based statistics, the estimated density (for a continuous random variable) is not obtained from the ECDF by differentiation, because the sample size is finite and the ECDF, being a step function, is not differentiable. Instead, we estimate the density directly. I guess plot(density(a)) is what you are really looking for.
A few days later...
Warning: the following is just a numerical solution without statistical ground!
I take it as an exercise to learn about the R package scam for shape-constrained additive models, an extension of mgcv written by Dr Pya, an early PhD student of Prof Wood.
The logic is as follows:
using scam::scam, fit a monotonically increasing P-spline to the ECDF (you have to specify how many knots you want); [note that monotonicity is not the only theoretical constraint: the smoothed ECDF should also be "clipped" at its two edges, the left edge at 0 and the right edge at 1. I am currently imposing this constraint through weights, by giving a very large weight to each of the two edge points]
using stats::splinefun, reparametrize the fitted spline with a monotonic interpolation spline through knots and predicted values at knots;
return the interpolation spline function, which can also evaluate the 1st, 2nd and 3rd derivatives.
Why I expect this to work:
As sample size grows,
ECDF converges to CDF;
the P-spline is consistent, so the smoothed ECDF will be an increasingly accurate estimate of the CDF;
the 1st derivative of the smoothed ECDF will be an increasingly accurate estimate of the PDF.
Use with caution:
You have to choose the number of knots yourself;
the derivative is NOT normalized so that the area under the curve is 1;
the result can be rather unstable, and is only good for large sample size.
function arguments:
x: a vector of samples;
n.knots: number of knots;
n.cells: number of grid points when plotting derivative function
You need to install the scam package from CRAN.
library(scam)
test <- function (x, n.knots, n.cells) {
  ## get ECDF
  n <- length(x)
  x <- sort(x)
  y <- 1:n / n
  dat <- data.frame(x = x, y = y)  ## make sure `scam` can find `x` and `y`
  ## fit a monotonically increasing P-spline for ECDF
  fit <- scam::scam(y ~ s(x, bs = "mpi", k = n.knots), data = dat,
                    weights = c(n, rep(1, n - 2), 10 * n))
  ## interior knots
  xk <- with(fit$smooth[[1]], knots[4:(length(knots) - 3)])
  ## spline values at interior knots
  yk <- predict(fit, newdata = data.frame(x = xk))
  ## reparametrization into a monotone interpolation spline
  f <- stats::splinefun(xk, yk, "hyman")
  par(mfrow = c(1, 2))
  plot(x, y, pch = 19, col = "gray")  ## ECDF
  lines(x, f(x), type = "l")          ## smoothed ECDF
  title(paste0("number of knots: ", n.knots,
               "\neffective degree of freedom: ", round(sum(fit$edf), 2)),
        cex.main = 0.8)
  xg <- seq(min(x), max(x), length = n.cells)
  plot(xg, f(xg, 1), type = "l")      ## density estimated by scam
  lines(stats::density(x), col = 2)   ## a proper density estimate by density
  ## return smooth ECDF function
  f
}
## try large sample size
set.seed(1)
x <- rnorm(1000)
f <- test(x, n.knots = 20, n.cells = 100)
f is a function as returned by stats::splinefun (read ?splinefun).
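A small usage note (the exact values depend on the random sample): the returned function takes a deriv argument, so both the smoothed ECDF and its derivative can be evaluated at arbitrary points.
f(0)              ## smoothed ECDF at 0 (roughly 0.5 for a standard normal sample)
f(0, deriv = 1)   ## estimated density at 0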
A naive, similar solution is to fit an interpolation spline to the ECDF without smoothing. But this is a very bad idea, as it has no consistency.
g <- splinefun(sort(x), 1:length(x) / length(x), method = "hyman")
curve(g(x, deriv = 1), from = -3, to = 3)
A reminder: it is highly recommended to use stats::density for a direct density estimation.
In GAM (and GLM, for that matter), we're fitting a conditional likelihood model. So after fitting the model, for a new input x and response y, I should be able to compute the predictive probability or density of a specific value of y given x. I might want to do this to compare the fit of various models on validation data, for example. Is there a convenient way to do this with a fitted GAM in mgcv? Otherwise, how do I figure out the exact form of the density that is used so I can plug in the parameters appropriately?
As a specific example, consider a negative binomial GAM:
## From ?negbin
library(mgcv)
set.seed(3)
n<-400
dat <- gamSim(1,n=n)
g <- exp(dat$f/5)
## negative binomial data...
dat$y <- rnbinom(g,size=3,mu=g)
## fit with theta estimation...
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=nb(),data=dat)
And now I want to compute the predictive probability of, say, y=7, given x=(.1,.2,.3,.4).
Yes. mgcv is doing (empirical) Bayesian estimation, so you can obtain a predictive distribution. For your example, here is how.
# prediction on the link (with standard error)
l <- predict(b, newdata = data.frame(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4), se.fit = TRUE)
# by large-sample GLM theory, the value on the link scale is approximately normally distributed
# for a negative binomial with `log` link, the fitted mean `mu` is therefore log-normal
p.mu <- function (mu) dlnorm(mu, l[[1]], l[[2]])
# joint density of `y` and `mu`
p.y.mu <- function (y, mu) dnbinom(y, size = 3, mu = mu) * p.mu(mu)
# marginal probability (not density as negative binomial is discrete) of `y` (integrating out `mu`)
# I have carefully written this function so it can take vector input
p.y <- function (y) {
scalar.p.y <- function (scalar.y) integrate(p.y.mu, lower = 0, upper = Inf, y = scalar.y)[[1]]
sapply(y, scalar.p.y)
}
Now since you want probability of y = 7, conditional on specified new data, use
p.y(7)
# 0.07810065
In general, this approach by numerical integration is not easy. For example, if another link function like sqrt() is used for the negative binomial, the distribution of the fitted mean is not that straightforward (though also not difficult to derive).
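For instance, a hedged sketch (not from the original answer) of what changes with a sqrt link: the fitted mean becomes mu = eta^2 with eta normal on the link scale, so p.mu follows from a change of variables rather than from dlnorm; m and s below stand for the link-scale mean and standard error.
## density of mu = eta^2 when eta ~ N(m, s^2), by change of variables
p.mu.sqrt <- function (mu, m, s) (dnorm(sqrt(mu), m, s) + dnorm(-sqrt(mu), m, s)) / (2 * sqrt(mu))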
Now I offer a sampling-based, or Monte Carlo, approach. This is most similar to a Bayesian procedure.
N <- 1000 # sample size
set.seed(0)
## draw N samples from posterior of `mu`
sample.mu <- b$family$linkinv(rnorm(N, l[[1]], l[[2]]))
## draw N samples from likelihood `Pr(y|mu)`
sample.y <- rnbinom(N, size = 3, mu = sample.mu)
## Monte Carlo estimation for `Pr(y = 7)`
mean(sample.y == 7)
# 0.076
Remark 1
Note that as empirical Bayes, all above methods are conditional on estimated smoothing parameters. If you want something like a "full Bayes", set unconditional = TRUE in predict().
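For example, a sketch reusing the prediction call from above:
## same link-scale prediction, but with the smoothing-parameter-uncertainty
## corrected covariance matrix used for the standard errors
l.uncond <- predict(b, newdata = data.frame(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4),
                    se.fit = TRUE, unconditional = TRUE)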
Remark 2
Perhaps some people assume the solution is as simple as this:
mu <- predict(b, newdata = data.frame(x0 = 0.1, x1 = 0.2, x2 = 0.3, x3 = 0.4), type = "response")
dnbinom(7, size = 3, mu = mu)
Such a result is conditional on the estimated regression coefficients (assumed fixed, without uncertainty), so mu is fixed rather than random. This is not a predictive distribution; a predictive distribution would integrate out the uncertainty of the model estimation.