I'm trying to find the single value of scale in the function bayesmeta::qhalfnormal such that the first and second elements of the vector low_high <- c(.1, 1) are its .025 and .975 quantiles, respectively.
In other words, for what value of scale is .1 the .025 quantile and 1 the .975 quantile?
So I have one parameter (scale) to optimize, and I expect a single value for it. I'm using optim below, but this way I get two values for scale.
Is there a better optimization function that would give me a single value for scale?
library(bayesmeta)
low_high <- c(.1, 1)
alpha <- c(.025, .975)
f <- function(x) {
  low_high - qhalfnormal(alpha, scale = x)
}
optim(low_high, function(x) sum(f(x)^2))
# $par
# [1] 3.1939758 0.4461607  # I expect a single value for `scale`
# But it seems `optim()` has acted like `Vectorize(optimize)` looping over
# elements of `low_high` vector.
@anonymous.asker is correct that passing a vector of length 2 is confusing optim(). What is happening is that qhalfnormal() vectorizes over both the quantile levels and the vector of scale values you gave it: e.g. qhalfnormal(c(0.025, 0.975), scale = c(0.1, 1)) returns a two-element vector comprising (1) the 0.025 quantile for a scale parameter of 0.1 and (2) the 0.975 quantile for a scale parameter of 1. These then get collapsed to a single output value by the sum-of-squares operation ... (what you wanted, I think, was to evaluate qhalfnormal() at both quantile levels for a single scale parameter).
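For illustration, here is that double vectorization in isolation, using the alpha and low_high objects defined in the question:

qhalfnormal(alpha, scale = low_high)
# element 1: the 0.025 quantile for scale = 0.1
# element 2: the 0.975 quantile for scale = 1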
If you specify a single value that is close enough to the true value you get an answer, and a warning suggesting that you not use Nelder-Mead:
optim(0.45, function(x)sum(f(x)^2))
If your starting value is too far from the solution you get a warning and an error as soon as the algorithm tries a negative value for the parameter ("scale > 0 is not TRUE").
The sensible way to do this is to specify method="Brent" (as suggested by the warning), at which point you also need to specify bounds:
optim(1, function(x) sum(f(x)^2), method = "Brent", lower = 0, upper = 10)
This returns 0.4461; this is indeed the argmin (parameter corresponding to the minimum value) for this problem. As @Onyambu points out in the comments, though, it doesn't really solve the larger problem (which is to try to reduce both values to 0); it solves the problem as posed, which is to minimize the objective function ...
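As a quick check (my own addition), evaluating the question's residual function at the Brent solution shows that only the second target is (nearly) matched; the first residual stays well away from zero, because half-normal quantiles all scale linearly with the scale parameter:

f(0.4461607)
# the first element is far from 0, the second is essentially 0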
You are passing a vector of length 2 as the initial estimate. If you want to set bounds for your variable, those are set through different arguments in optim.
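For example, a minimal sketch (reusing f() from the question) that passes a single scalar starting value and supplies box constraints through optim()'s lower/upper arguments, which require a bounded method such as "L-BFGS-B" or "Brent":

optim(par = 0.5, fn = function(x) sum(f(x)^2),
      method = "L-BFGS-B", lower = 1e-6, upper = 10)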
I used the following function to estimate the three-parameter Weibull distribution.
library(bbmle)
library(FAdist)
set.seed(16)
xl = rweibull3(50, shape = 1, scale = 1, thres = 0)
dweib3l <- function(shape, scale, thres) {
  -sum(dweibull3(xl, shape, scale, thres, log = TRUE))
}
ml <- mle2(dweib3l, start = list(shape = 1, scale = 1, thres = 0), data = list(xl))
However, when I run the above function I am getting the following error.
Error in optim(par = c(shape = 1, scale = 1, thres = 0), fn = function (p) :
non-finite finite-difference value [3]
In addition: There were 16 warnings (use warnings() to see them)
Is there any way to overcome this issue?
Thank you!
The problem is that the threshold parameter is special: it defines a sharp boundary to the distribution, so any value of thres above the minimum value of the data will give zero likelihoods (-Inf negative log-likelihoods): if a given value of xl is less than the specified threshold, then it's impossible according to the statistical model you have defined. Furthermore, we know already that the maximum likelihood value of the threshold is equal to the minimum value in the data set (analogous results hold for MLE estimation of the bounds of a uniform distribution ...)
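To make this concrete, here is a quick check (my own illustration, reusing the simulated xl and the dweib3l() function from the question): any threshold above min(xl) gives at least one observation a log-density of -Inf, so the negative log-likelihood becomes non-finite.

dweibull3(min(xl), shape = 1, scale = 1, thres = min(xl) + 0.01, log = TRUE)
# should be -Inf
dweib3l(shape = 1, scale = 1, thres = min(xl) + 0.01)
# should be Inf (a non-finite negative log-likelihood)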
I don't know why the other questions on SO that address this question don't encounter this particular problem - it may be because they use a starting value of the threshold that's far enough below the minimum value in the data set ...
Below, I use a fixed value of min(xl)-1e-5 for the threshold (shifting the value downward avoids numerical problems when the value is exactly on the boundary). I also use the formula interface so we can call the dweibull3() function directly, and put lower bounds on the shape and scale parameters (as a result I need to use method="L-BFGS-B", which allows for constraints).
ml <- mle2(xl ~ dweibull3(shape = shape, scale = scale,
                          thres = min(xl) - 1e-5),
           start = list(shape = 1, scale = 1),
           lower = c(0, 0),
           method = "L-BFGS-B",
           data = data.frame(xl))
(The formula interface is convenient for simple examples: if you want to do something very much more complicated you may want to go back to defining your own log-likelihood function explicitly.)
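For reference, a sketch of the equivalent fit written with an explicit negative log-likelihood function (fixing the threshold at min(xl) - 1e-5 as above); this mirrors the structure of the function in the question:

nll <- function(shape, scale) {
  -sum(dweibull3(xl, shape, scale, thres = min(xl) - 1e-5, log = TRUE))
}
ml2 <- mle2(nll, start = list(shape = 1, scale = 1),
            lower = c(shape = 0, scale = 0), method = "L-BFGS-B")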
If you insist on fitting the threshold parameter, you can do it by setting an upper bound that is (nearly) equal to the minimum value that occurs in the data [any larger value will give NA values and thus break the optimization]. However, you will find that the estimate of the threshold parameter always converges to this upper bound ... so this approach is really getting to the previous answer the hard way (you'll also get warnings about parameters being on the boundary, and about not being able to invert the Hessian).
eps <- 1e-8
ml3 <- mle2(xl ~ dweibull3(shape = shape, scale = scale, thres = thres),
            start = list(shape = 1, scale = 1, thres = -5),
            lower = c(shape = 0, scale = 0, thres = -Inf),
            upper = c(shape = Inf, scale = Inf, thres = min(xl) - eps),
            method = "L-BFGS-B",
            data = data.frame(xl))
For what it's worth, it does seem to be possible to fit the model without fixing the threshold parameter, if you start with a small value and use Nelder-Mead optimization; however, it seems to give unreliable results.
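For completeness, a sketch of that free-threshold Nelder-Mead fit (the starting value for the threshold is my own choice; expect warnings, and treat the estimates with caution):

ml4 <- mle2(xl ~ dweibull3(shape = shape, scale = scale, thres = thres),
            start = list(shape = 1, scale = 1, thres = min(xl) - 0.1),
            method = "Nelder-Mead",
            data = data.frame(xl))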
This may be a basic/fundamental question about the 'dnorm' function in R. Let's say I create some z scores through a z transformation and try to take the sum of the 'dnorm' values.
data=c(232323,4444,22,2220929,22323,13)
z=(data-mean(data))/sd(data)
result=dnorm(z,0,1)
sum(result)
[1] 1.879131
As shown above, the sum of the 'dnorm' values is neither 1 nor 0.
Now let's say I use a mean of zero and a standard deviation of one in the z transformation itself.
data=c(232323,4444,22,2220929,22323,13)
z=(data-0)/1
result=dnorm(z,0,1)
sum(result)
[1] 7.998828e-38
I still do not get either 0 or 1 as the sum.
If my goal is to have the probabilities sum to one, as I will need for further use, what approach do you recommend, with 'dnorm' or even with other PDF functions?
dnorm returns the values of the normal probability density function evaluated at its arguments. It does not return probabilities. What is your reasoning that the sum of your transformed data evaluated in the density function should equal one or zero? You're just evaluating the density at a set of points; there is no reason the sum should ever equal exactly zero or one.
Integrating dnorm yields a probability. Integrating dnorm over the entire support of the random variable yields a probability of one:
integrate(dnorm, -Inf, Inf)
#1 with absolute error < 9.4e-05
In fact, integrate(dnorm, -Inf, x) conceptually equals pnorm(x) for all x.
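For example (my own check, with x = 1.5):

integrate(dnorm, -Inf, 1.5)$value
# approximately 0.9331928
pnorm(1.5)
#[1] 0.9331928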
Edit: In light of your comment.
The same applies to the density functions (PDFs) of other continuous probability distributions:
integrate(dexp, 0, Inf, rate = 57)
#1 with absolute error < 1.3e-05
Note that the ... argument(s) described in ?integrate are passed on to the integrand.
Recall also that the Poisson distribution, say, is a discrete probability distribution, so integrating it (in the conventional sense) makes no sense. A discrete probability distribution has a probability mass function (PMF) rather than a PDF, and the PMF does return actual probabilities; in that case, they should sum to one.
Consider:
dpois(0.5, lambda = 2)
#[1] 0
#Warning message:
#In dpois(0.5, lambda = 2) : non-integer x = 0.500000
Summing from 0 to a 'very' large number (i.e. over the support of the Poisson distribution):
sum(dpois(0:1000000, lambda = 2))
#[1] 1
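Coming back to your continuous case: if you really need a finite sum that is (approximately) one, the closest analogue is a Riemann-sum approximation of the integral over a fine grid (my own illustration, not something dnorm does for you):

step <- 0.01
sum(dnorm(seq(-10, 10, by = step))) * step
# approximately 1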
I would like to use optimize(), or something similar, to search for a minimum / maximum value of a function. However, I am unsure about the exact range over which the function should be optimized, which is a required argument of optimize() (e.g. optimize(f = FUN, interval = c(lowerBound, upperBound))).
In this optimization problem, I am able to estimate a value a that is "close" to the optimal solution, but "closeness" depends on the situation.
Is there a function in R that can use the initial value a and does not require the interval over which the function is optimized to be specified up front?
When you say you're not sure about the lower limit, I suspect that this means that the parameter you are trying to estimate is not bounded below.
If this is the case, one trick is to transform the function so that there is a lower bound on the parameter.
This trivial function has a minimum at x=4:
fun <- function(x) -exp(-(x - 4)^2) + 8
which we can find via:
optimize(f=fun,interval=c(0,8))
#> $minimum
#> [1] 4
but let's pretend for a moment that we're not sure if there is a lower limit or not, and that we know that the upper limit is 8. R will throw an error if we try:
optimize(f=fun,interval=c(-Inf,8))
because the bounds must be finite. In this case, we can use the exponential transformation (exp()) which maps
the real numbers to the positive numbers, like so:
optimize(f = function(x) fun(log(x)),
         interval = exp(c(-Inf, 8)))
#> $minimum
#> [1] 54.59815
and then to get the location of the minimum on the original scale, you just need to back-transform the solution above via:
log(54.59815)
#> 4
If you don't know either the upper or lower bound on the underlying parameter, then you can use the log-odds transformation in place of the log():
function(x) log(x/(1-x))
and its inverse in place of exp():
function(y) exp(y)/(1 + exp(y))
Note that the log-odds transformation maps the real numbers onto the unit interval, so the interval parameter becomes 0:1.
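For instance, a sketch of the log-odds version applied to the same fun(); note that base R's qlogis() and plogis() implement exactly these two transformations:

optimize(f = function(p) fun(qlogis(p)), interval = c(0, 1))
# $minimum should be approximately plogis(4) = 0.982;
# back-transforming it with qlogis() recovers the solution near 4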
These solutions do have some numerical limitations (e.g. if we had set interval=exp(c(-Inf,16)) in the first solution, we would have gotten an error). Tip: you can re-scale these transformations so that they are centered around a given point a, which can reduce the numerical limitations.
My understanding of the lchoose function in R is that it is simply lchoose(a, b) = log(choose(a, b)).
However, I found that:
temp <- 7.9999993
k <- 8
choose(temp,k)
[1] 0
lchoose(temp,k)
[1] 0
log(choose(temp,k))
[1] -Inf
So lchoose is not log of the choose function output.
Why is this happening?
In the discrete case (i.e. integer n), choose(n, k) computes the number of distinct k-element subsets of a set of n elements, so if k > n you are counting subsets that have more elements than the set itself. Since there are no such subsets, the answer is zero.
In general, for an n which is any real number, the function can still be computed; however, it keeps the same meaning it has over the integers, so for k > n the function has a value of zero. If you look at the definition of the binomial function with real n (see here) you'll see that the answer is zero, but I have tried to explain it, hopefully, in an intuitive manner.
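For the integer case described above, you can check directly that the two functions agree:

choose(7, 8)
#[1] 0
lchoose(7, 8)
#[1] -Inf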
I am trying to determine the weights of 9 metrics that will return the highest accuracy ratio. Since they are weights, the values need to sum to 1 and lie between 0 and 1. I am currently using the optim function, but due to the constraints I think I need to switch to constrOptim, and I was wondering about the best way to do this. Below I have included the code I am currently using; x.matrix is a 20,000 by 9 matrix of values ranked between 1 and 10.
pars <- c(w1 = 1/9, w2 = 1/9, w3 = 1/9, w4 = 1/9, w5 = 1/9,
          w6 = 1/9, w7 = 1/9, w8 = 1/9, w9 = 1/9)
OptPars <- function(pars) { -(rcorr.cens(x.matrix %*% pars, f)["Dxy"]) }  # rcorr.cens() is from Hmisc
opt <- optim(pars, OptPars)
Say you have values x on the range (-Inf, Inf) and you need values p in the range [0, 1] that sum to 1; you can do the following transformation:
p <- exp(x)/sum(exp(x))
If you do that transformation inside your optimization function, and then apply the same transformation to the best set of parameters it returns, you should get what you want.
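A minimal sketch of the idea, assuming x.matrix, f and rcorr.cens() (from the Hmisc package) are defined as in your code:

softmax <- function(x) exp(x) / sum(exp(x))

OptPars <- function(x) {
  w <- softmax(x)                        # weights in (0, 1) that sum to 1
  -(rcorr.cens(x.matrix %*% w, f)["Dxy"])
}

opt <- optim(rep(0, 9), OptPars)         # optimize over unconstrained values
weights <- softmax(opt$par)              # back-transform the best parameters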