R: compute an integral with an unknown parameter so that it equals a certain value (for example: integral = 0.6)

I am trying to simulate values from an integral with an unknown parameter (to create a climatological forecaster).
My function is $\int_{0}^{0.25} 4\, y^{-1/x} \, dx$.
Normally one would plug in a value for y and get the value of the integral as output.
However, I want to input the value the integral should equal and get the corresponding y as output.
I have three runif vectors of length 1,000, 10,000 and 100,000 (with values between 0 and 1), which I use as my input values.
Say the first value is 0.3 and the second value is 0.78.
I want to calculate for which y the integral above equals 0.3 (or 0.78 for the second value).
How can I do this in R?
I've tried a few things with the integrate function, but it needs a value for y before it can return anything.

You are trying to solve a non-linear equation with an integral inside.
Intuitively, you need to start with an interval in which the desired y lies, try different values of y, calculate the integral for each, and narrow the interval according to the results.
You can implement that in R using integrate and optimize as below:
# Function to be integrated: x is the integration variable, y the unknown parameter
f <- function(x, y) {
  4 * y^(-1/x)
}

# Integral of f over [0, 0.25] for a given y
intf <- function(y) {
  integrate(f, 0, 0.25, y = y)
}

# Distance between the value of the integral and the target value
objective <- function(y, value) {
  abs(intf(y)$value - value)
}

optimize(objective, c(1, 10), value = 0.3)
#$minimum
#[1] 1.14745
#
#$objective
#[1] 1.540169e-05
optimize(objective, c(1, 10), value=0.78)
#$minimum
#[1] 1.017891
#
#$objective
#[1] 0.0001655954
Here, f is the function to be integrated, intf calculates the integral for a given y, and objective measures the distance between the value of the integral and the desired value.
Since optimize finds the minimum of a function, it returns the y for which the integral is closest to the target value.
Note that non-linear equations with an integral inside are in general tough to solve. This case is manageable because the integral is monotonic and continuous in y, so the solution for y is unique and can easily be found by narrowing the interval.
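Since the integral is monotone in y, root finding works too and scales easily to the runif vectors from the question. Below is a minimal sketch building on intf above; the helper name solve_y and the wide bracket c(1, 1e6) are my assumptions (the integral equals 1 at y = 1 and decays towards 0 as y grows, so this bracket covers targets anywhere in (0, 1)):
solve_y <- function(target, interval = c(1, 1e6)) {
  # uniroot finds the y at which the integral minus the target crosses zero
  uniroot(function(y) intf(y)$value - target, interval)$root
}
targets <- runif(1000)              # stand-in for the question's input vectors
y_vals <- sapply(targets, solve_y)  # one y per target value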

Related

Trying to plot loglikelihood of Cauchy distribution for different values of theta in R

I am trying to plot the log-likelihood function of the Cauchy distribution for varying values of theta (location parameter). These are my observations:
obs<-c(1.77,-0.23,2.76,3.80,3.47,56.75,-1.34,4.24,3.29,3.71,-2.40,4.53,-0.07,-1.05,-13.87,-2.53,-1.74,0.27,43.21)
Here is my log-likelihood function:
ll_c <- function(theta, x_values) {
  n <- length(x_values)
  logl <- -n*log(pi) - sum(log(1 + (x_values - theta)^2))
  return(logl)
}
and I've tried making a plot using this code:
x <- seq(from = -10, to = 10, by = 0.1); length(x)
theta_null <- NULL
for (i in x) {
  theta_log <- ll_c(i, counts)
  theta_null <- c(theta_null, theta_log)
}
plot(theta_null)
The graph does not look right and for some reason the length of x and theta_null differs.
I am assuming that theta is your location parameter (the scale is set to 1 in my example). You should obtain the same result using a t-distribution with 1 df and shifting the observations by theta. I left some comments in the code as guidance.
obs <- c(1.77, -0.23, 2.76, 3.80, 3.47, 56.75, -1.34, 4.24, 3.29, 3.71,
         -2.40, 4.53, -0.07, -1.05, -13.87, -2.53, -1.74, 0.27, 43.21)

ll_c <- function(theta, obs) {
  # Compute the log-likelihood of obs for one value of theta (location)
  logl <- sum(dcauchy(obs, location = theta, scale = 1, log = TRUE))
  return(logl)
}

# Loop over possible values of theta (obs given)
x <- seq(from = -10, to = 10, by = 0.1)
ll <- NULL
for (i in x) {
  ll <- c(ll, ll_c(i, obs))
}

# Plot log-likelihood against theta
plot(x, ll)
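The grid value with the largest log-likelihood then gives a quick approximate MLE for theta (a one-liner of mine, not part of the original answer):
x[which.max(ll)]  # theta on the grid with the highest log-likelihood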
It is hard to say exactly what you are experiencing without more info. But I'll make an educated guess.
First of all, we can simplify this a lot by using the *t family of functions for the t distribution, as the Cauchy distribution is just the t distribution with df = 1. So your calculations could have been done using
# with ncp a grid of candidate thetas (e.g. seq(-10, 10, by = 0.1)),
# values the observations, and theta_null initialised to NULL:
for (i in ncp)
  theta_null <- c(theta_null, sum(dt(values, 1, i, log = TRUE)))
Note that constant terms such as the -n*log(pi) in your version don't actually matter for any practical purpose. We are usually interested in minimizing/maximizing the likelihood, in which case all additive constants are irrelevant.
Now if we use this approach, we can quite quickly notice something by printing the values:
print(head(theta_null))
[1] -Inf -Inf -Inf -Inf -Inf -Inf
So I am assuming that what you are experiencing is that many of your values are "almost" negative infinity, and perhaps these are not stored correctly in your outcome vector. I can't see from your code why that should happen, but this would be my initial guess.
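One caveat worth adding (my note, not the answerer's): the third positional argument of dt is ncp, the noncentrality parameter, which is not the same thing as shifting the location, and the noncentral density can underflow to 0 (hence -Inf on the log scale). Shifting the data instead uses the exact Cauchy identity:
# dcauchy(x, location = theta) is exactly dt(x - theta, df = 1)
theta <- 2
all.equal(dcauchy(obs, location = theta, log = TRUE),
          dt(obs - theta, df = 1, log = TRUE))
# [1] TRUE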

How to return possible pairs of variables based on a function?

I have a pair of variables (x, y), and each variable has a range of possible values (xmin to xmax, ymin to ymax). I am looking for the pairs that, according to a function, would yield the same probability.
This is my function that would return probabilities.
f <- function(x, y) 1-exp(-(x^(1/0.9)+y^(1/0.9))^0.9)
Now suppose that for a certain probability, say 0.01, I want to know the possible pairs of x and y yielding that probability (respecting their constraints, the min and max values).
(What I have already tried is the whole thing the other way around: creating a matrix over x and y first and calculating the probability for each combination, but then I would need to find the equal probabilities in the matrix, which seems even more difficult.)
So by doing some math (sorry, LaTeX formatting is not supported on SO):
P = 1 - exp(-(x^(1/0.9) + y^(1/0.9))^0.9)
-ln(1-P) = (x^(1/0.9) + y^(1/0.9))^0.9
(-ln(1-P))^(1/0.9) = x^(1/0.9) + y^(1/0.9)
(-ln(1-P))^(1/0.9) - y^(1/0.9) = x^(1/0.9)
((-ln(1-P))^(1/0.9) - y^(1/0.9))^0.9 = x
Putting this into R code, with a check for when a result does not exist:
get_x <- function(P, y) {
  x <- ((-log(1-P))^(1/0.9) - y^(1/0.9))^0.9
  # Verify the results: if x[i] is not real (NaN), or does not reproduce
  # the given probability (which should never happen), set it to NaN.
  # This verification is for debugging only and can be removed.
  for (i in seq_along(y)) {
    if (is.na(x[i]) || abs(P - 1 + exp(-(x[i]^(1/0.9) + y[i]^(1/0.9))^0.9)) > 0.00001) {
      x[i] <- NaN
      print(paste0("Oops, something went wrong with y=", y[i]))
    }
  }
  return(x)
}
y_values <- seq(0.01, 0.99, by = 0.001)
get_x(0.09, y_values)
This is pretty fast: only one loop is used instead of the two needed to fill a matrix, so it is O(n) rather than O(n^2).
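To see the resulting iso-probability curve, plot x against y. One detail (my addition): only y values below -log(1-P) admit a real solution, so the grid can be restricted accordingly:
P <- 0.09
y_ok <- y_values[y_values < -log(1 - P)]  # larger y has no real x
plot(y_ok, get_x(P, y_ok), type = "l", xlab = "y", ylab = "x")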
Alternatively, we can calculate the probability for all possible combinations and build a data frame of the combinations that satisfy our criterion, with some tolerance (for the floating-point comparison):
# matrix2 holds the probabilities over the x/y grid from the question,
# with x values as row names and y values as column names
tol <- 0.0001
mat <- which((matrix2 >= 0.01 - tol) & (matrix2 <= 0.01 + tol), arr.ind = TRUE)
data.frame(comb1 = rownames(matrix2)[mat[, 1]], comb2 = colnames(matrix2)[mat[, 2]])
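For completeness, matrix2 can be built with outer(); the grid vectors xs and ys below are my assumptions, since the original post never showed the matrix construction. Note that the tolerance should be comparable to how much f changes between neighbouring grid points, otherwise matches can slip through the grid:
f <- function(x, y) 1 - exp(-(x^(1/0.9) + y^(1/0.9))^0.9)
xs <- seq(0.001, 0.99, by = 0.001)  # assumed x range
ys <- seq(0.001, 0.99, by = 0.001)  # assumed y range
matrix2 <- outer(xs, ys, f)         # probability for every (x, y) pair
rownames(matrix2) <- xs
colnames(matrix2) <- ys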

R: draw from a vector using custom probability function

Forgive me if this has been asked before (I feel it must have been, but I could not find precisely what I am looking for).
How can I draw one element of a vector of whole numbers (from 1 through, say, 10) using a probability function that gives the elements different chances? If I want equal probabilities I use runif() to get a number between 1 and 10:
ceiling(runif(1,1,10))
How do I similarly sample from, e.g., the exponential distribution to get a number between 1 and 10 (such that 1 is much more likely than 10), or from a logistic probability function (if I want a sigmoid, increasing probability from 1 through 10)?
The only "solution" I can come up with is to first draw, say, 1e6 numbers from the sigmoid distribution and then rescale min and max to 1 and 10 - but this looks clumsy.
UPDATE:
This awkward solution (and I don't feel it is very "correct") would go like this:
# Draw enough values from a distribution, here the exponential
x <- rexp(1e3)

# Rescale to the range 1-10
scaler <- function(vector, min, max) {
  (((vector - min(vector)) * (max - min)) / (max(vector) - min(vector))) + min
}
x_scale <- scaler(x, 1, 10)

# Sample once (and round it)
round(sample(x_scale, 1))
Are there no better solutions around?
I believe sample() is what you are looking for, as #HubertL mentioned in the comments. You can pick an increasing function (e.g. the logit() defined below), evaluate it over the vector v you want to sample from, and use the output as the vector of probabilities p. See the code below.
# Despite its name, this is the logistic (inverse-logit) function,
# which is increasing in x
logit <- function(x) {
  exp(x) / (exp(x) + 1)
}
v <- 1:10
p <- logit(v)  # increasing weights; sample() rescales them to probabilities
sample(v, 1, prob = p, replace = TRUE)
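The same pattern covers the exponential case from the question (a sketch; the rate of 0.5 is an arbitrary choice of mine): weight the values 1 to 10 by an exponential density so that 1 is much more likely than 10. sample() rescales the weights internally, so they need not sum to one.
v <- 1:10
p <- dexp(v, rate = 0.5)  # decreasing weights: 1 is most likely
sample(v, 1, prob = p)    # a single draw
table(sample(v, 1e4, prob = p, replace = TRUE))  # shape of many draws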

Range of values taken by f(x) based on a range of values for x

I would like to know the range of values that a function f(x) can take based on a range of values of x.
For instance, say I have the quadratic function f(x) = x^2 - x + 0.2 and I want to know the range of f(x) for x in the interval [0.2, 1].
Is there a function or package in R that can do this?
If I understand your question correctly, you are looking for:
f <- function(x) x^2 - x + 0.2
x <- seq(0.2, 1, by=0.1)
range(f(x))
# [1] -0.05 0.20 # approximate numerical answer
If you want the range analytically, you have to do some mathematics (or further programming) to determine the maximum and minimum of the function f over that range of x.
An analytic answer can be calculated using calculus, provided the function is differentiable. For the example quadratic, the calculation is:
f'(x) = 2x - 1 = 0  =>  x* = 1/2 is the argmin/argmax, and it lies within the domain [0.2, 1].
Evaluate f at the domain endpoints and at x*:
f(0.2) = 0.04, f(0.5) = -0.05, f(1) = 0.2.
So min = -0.05 and max = 0.2.
A numerical approximation will work if the function is well-behaved (e.g. continuous and differentiable). Otherwise, a spike or discontinuity (e.g. f(x) = 1/x) can be missed, depending on the step size.
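A numerical middle ground (a sketch of mine, not from the answers above) is to let optimize() locate the interior extrema and compare them with the endpoint values, since an extreme value may sit on the boundary:
f <- function(x) x^2 - x + 0.2
lo <- optimize(f, interval = c(0.2, 1))                  # interior minimum
hi <- optimize(f, interval = c(0.2, 1), maximum = TRUE)  # interior maximum, if any
c(min = min(lo$objective, f(0.2), f(1)),
  max = max(hi$objective, f(0.2), f(1)))
#   min   max
# -0.05  0.20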

Finding the Maximum of a Function with numerical derivatives in R

I wish to numerically find the maximum of the function multiplying Beta 3 shown on p346 of the following link, when tau = 30:
http://www.ssc.upenn.edu/~fdiebold/papers/paper49/Diebold-Li.pdf
They give the answer on p347 as 0.0609.
I would like to confirm this numerically in R, i.e. take the derivative and find the value where it reaches zero.
library(numDeriv)

# Loading on Beta 3 with tau fixed at 30
testh <- function(lambda) {
  ((1 - exp(-lambda*30)) / (lambda*30)) - exp(-lambda*30)
}

# Squared numerical gradient of testh
grad_h <- function(lambda) {
  val <- grad(testh, lambda)
  return(val^2)
}

OptLam <- optimize(f = grad_h, interval = c(0.0001, 120), tol = 0.0000000000001)
I take the square of the gradient because I want the minimum to be at zero.
Unfortunately, the answer comes back as lambda = 120! With lambda at 120 the value of the objective function is 5.36e-12.
By working by hand I can find a lower value of the numerical derivative that is closer to zero (it is also close to the analytical value given above):
grad_h(0.05977604)
## [1] 4.24494e-12
Why is the function above not finding this lower value? I have set the tolerance very tight, so it should be able to find this optimal value.
Is it possible to correct the existing method so that it gives the correct answer?
Is there a better way to find the maximum of a function via its derivative numerically in R?
For example, is there an optimizer that looks for a zero rather than trying to find a minimum or maximum?
You can use uniroot to find where the derivative is 0. Over an interval as wide as (0.0001, 120), optimize struggles because the squared gradient is nearly flat for large lambda: the loading itself flattens out, so its gradient is close to zero over most of the interval, and the search is drawn to the right edge. Applying a root-finder to the gradient is more direct. This might work for you:
grad_h <- function(lambda) {
  val <- grad(testh, lambda)
  return(val)
}

## Find the root of the derivative
res <- uniroot(grad_h, c(0, 120), tol = 1e-10)

## Visualise it
ls <- seq(0.001, 1, length = 1000)
plot(ls, testh(ls), col = "salmon")
abline(v = res$root, col = "steelblue", lwd = 2, lty = 2)
text(x = res$root, y = testh(res$root),
     labels = sprintf("(%f, %s)", res$root,
                      format(testh(res$root), scientific = TRUE)), adj = -0.1)
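Alternatively (my sketch, not part of the answer above), the derivative can be skipped entirely by maximising testh itself; the bracket c(0.0001, 1) is an assumption that comfortably contains the peak:
optimize(testh, interval = c(0.0001, 1), maximum = TRUE, tol = 1e-10)
# $maximum should land near the published value of 0.0609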
