Integrate a function and specify other parameters in R

Let's say I want to integrate the function
integrand <- function(x, a) exp(a*x)
where x is my covariate and a is simply a number specified by the user. How do I integrate this in R? I know you can do something like:
integrand <- function(x) {1/((x+1)*sqrt(x))}
integrate(integrand, lower = 0, upper = Inf)
But how can I supply other parameters? I am thinking that I probably need some kind of expression or eval function, but I am not sure how to implement it.

try:
integrand <- function(x, a) exp(a*x)
integrate(integrand, lower = 0, upper = Inf, a=-1)
Have a look at ?integrate and you'll see
integrate(f, lower, upper, ..., subdivisions = 100L,
rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol,
stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)
Arguments
f
an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
lower, upper
the limits of integration. Can be infinite.
...
additional arguments to be passed to f.
So you can pass as many additional named arguments as integrand needs in the ... position, after lower and upper, and they will be forwarded to integrand every time it is evaluated.
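As a minimal sketch of the ... mechanism described above (the second parameter `b` and the wrapper variant are illustrative additions, not part of the original question), the extra parameters can either be passed by name or fixed beforehand in a small wrapper function:

```r
# Integrand with two extra parameters; `b` is a hypothetical addition for illustration.
integrand <- function(x, a, b) b * exp(a * x)

# Option 1: pass the extra parameters by name; integrate() forwards them via `...`.
res_dots <- integrate(integrand, lower = 0, upper = Inf, a = -2, b = 3)

# Option 2: fix the parameters in a wrapper so integrate() sees a one-argument function.
res_wrap <- integrate(function(x) integrand(x, a = -2, b = 3), lower = 0, upper = Inf)

res_dots$value  # analytically b / (-a) = 1.5 when a < 0
res_wrap$value
```

The extra arguments should not be named f, lower, or upper, since those names belong to integrate() itself.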

Related

Questions about the optimization in R

The following code was presented by our teacher in a practical lesson. I have questions about the last line of the code in the optimization step using the function optim() but I add the other lines to clarify the work.
population <- read.csv("data.csv", header=FALSE, sep=';', dec=',')
and the data looks like:
1 8.29
2 5.37
3 10.61
4 5.92
5 14.99
6 9.74
7 15.47
.
.
.
We sample 100 elements from the population as
Sampled_Data <- sample(population$V1, 100)
then we write a function to calculate the log-likelihood of the log-normal distribution for the sampled elements as
MyFunc <- function(Myparameters,data){
Firstparameter <- Myparameters[1]
Secondparameter <- Myparameters[2]
n <- length(data)
Mydistribution <- -n/2*log(2*pi*(Secondparameter^2)) - sum(log(data)) - (1/(2*Secondparameter^2))*sum((log(data)-Firstparameter)^2)
return(Mydistribution)
}
Finally, we use the function optim() to estimate the two parameters of the distribution by maximum likelihood
optimisation <- optim(c(1,1),MyFunc,data=Sampled_Data)
In the call to optim(), I don't understand why he passed the vector c(1,1). According to the documentation, this argument holds the initial values for the parameters. Does he assume that the initial values are 1 and 1? If so, on what basis do we choose the initial values?
Also, why did he add data=Sampled_Data when there is nothing similar in the documentation? According to the documentation, after the function we should supply things like the gradient, the method and the bounds, but not the data!
Finally, if I want to specify the lower and upper bounds, it is not clear to me which values to use in my case with the log-normal distribution.
I was not sure where to post this question, here or on Cross Validated, but I saw similar questions here. If this is not a suitable place, I will delete the question.
Unnamed arguments to functions in R are assigned in the order that they appear in the definition.
If we look at help with help(optim):
Usage
optim(par, fn, gr = NULL, ...,
method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN",
"Brent"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE)
We see that the first formal argument is par. Therefore, you are correct that your instructor has set par = c(1,1) and fn = MyFunc. You will also see in help(optim) that you can set the lower = and upper = arguments as well.
If you look back at the definition of MyFunc, you will see that its first argument is Myparameters. Therefore, when optim is called, c(1,1) is passed to MyFunc as its first argument. Thus, on the initial step, both Firstparameter and Secondparameter will be set to 1.
Finally, if you look carefully at help(optim):
Usage
optim(par, fn, gr = NULL, ...,
method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN",
"Brent"),
lower = -Inf, upper = Inf,
control = list(), hessian = FALSE)
Arguments
par
Initial values for the parameters to be optimized over.
fn
A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.
gr
A function to return the gradient for the "BFGS", "CG" and "L-BFGS-B" methods. If it is NULL, a finite-difference approximation will be used.
...
Further arguments to be passed to fn and gr.
You will see that ... can hold further arguments to be passed to fn, and thus to MyFunc. In this case, that is data=Sampled_Data.
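One point worth checking in the call from the question: optim() minimizes by default, while MyFunc returns a log-likelihood that is meant to be maximized. The sketch below is one possible way to put the pieces together, not the instructor's code. It assumes simulated data in place of data.csv, uses hypothetical names (`sampled`, `loglik`, `fit`), asks for maximization through control = list(fnscale = -1), and uses the "L-BFGS-B" method so that a lower bound can keep the second parameter positive. The starting values c(1, 1) are simply a guess, as in the question.

```r
set.seed(1)
# Simulated stand-in for Sampled_Data (assumption: the real data.csv is not available here).
sampled <- rlnorm(100, meanlog = 2, sdlog = 0.5)

# Same log-likelihood as MyFunc: par[1] is the mean and par[2] the sd of log(data).
loglik <- function(par, data) {
  n <- length(data)
  -n / 2 * log(2 * pi * par[2]^2) - sum(log(data)) -
    sum((log(data) - par[1])^2) / (2 * par[2]^2)
}

fit <- optim(
  par = c(1, 1),                # starting values, handed to loglik as `par`
  fn = loglik,
  data = sampled,               # forwarded to loglik through `...`
  method = "L-BFGS-B",          # a method that accepts box constraints
  lower = c(-Inf, 1e-6),        # keep the sd parameter strictly positive
  control = list(fnscale = -1)  # negative fnscale makes optim() maximize
)

fit$par
# Closed-form maximum likelihood estimates, for comparison
mle <- c(mean(log(sampled)), sqrt(mean((log(sampled) - mean(log(sampled)))^2)))
mle
```

An equivalent choice is to return the negative log-likelihood from the function and leave fnscale at its default.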

Integral and numeric optimization (nlminb) R

I am having issues with an optimization problem involving the numerical estimation of an integral which contains an unknown variable.
Numerically estimating an integral is simple enough: just use the integrate function in R. I am trying to estimate a rather unpleasant integral which requires optimization, since it contains an unknown variable and a constraint. I am using the nlminb function, but the result is clearly incorrect. The idea is to constrain the integral to be smaller than or equal to 1-l, where l is between 0 and 1.
the code is the following:
integrand <- function(x, p) {dnorm(x,0,1)*(1-dnorm((qnorm(p)
-sqrt(0.12)*x)/(sqrt(1-0.12)), 0, 1))^800}
and it is the variable p which is unknown.
The objective function to be minimised is the following:
objective <- function(p){
PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
PoD - 0.5
}
test <- nlminb(0.015, objective = objective, lower = 0, upper = 1)$par*100
Edited to reflect mistakes in the objective function and the integral.
Same issue still remains.
I think my mistake is not specifying which variable to minimise over. The optimisation just returns the starting value given to nlminb, multiplied by 100.
The authors of the paper used dummy variables and showed that l = 0.5 should give p = 0.15%.
Thank you for your time.
That is to be expected, since your objective function does not depend on p. Do:
integrand <- function(x, p) {dnorm(x,0,1)*(1-dnorm((qnorm(p)
-sqrt(0.12)*x)/(sqrt(1-0.12)), 0, 1))^800}
objective <- function(p){
PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
PoD - 0.5
}
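Note that minimizing the signed difference PoD - 0.5 pushes the integral as low as it can go rather than toward 0.5. If the goal is the value of p at which the integral equals the target, one option is a root finder such as uniroot() from base R. The sketch below reuses the integrand exactly as posted; the name `gap` and the bracketing interval c(1e-6, 0.5) are assumptions made for illustration, and the interval should be checked against the real problem.

```r
integrand <- function(x, p) {
  dnorm(x, 0, 1) *
    (1 - dnorm((qnorm(p) - sqrt(0.12) * x) / sqrt(1 - 0.12), 0, 1))^800
}

# Signed gap between the integral and the target level (0.5, as in the question).
gap <- function(p) {
  integrate(integrand, lower = -Inf, upper = Inf, p = p)$value - 0.5
}

# uniroot() needs an interval on which gap() changes sign; these endpoints are
# an assumption chosen for illustration.
sol <- uniroot(gap, interval = c(1e-6, 0.5), tol = 1e-10)

sol$root       # value of p at which the integral equals the target
gap(sol$root)  # should be close to 0
```

Another possibility, closer to the original nlminb call, is to minimize the squared gap, although a flat objective near the starting value can leave the optimizer where it started.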

non-numeric argument to binary operator in integration in R

f1 = function(t){exp(-t^2)}
phi1 = function(theta){
0.5-1/sqrt(pi)*integrate(f1, lower = 0, upper = (10-theta)/(2*sqrt(2)))}
mu1 = (450*barx)/(454)
sigs1 = 36/454
pi_thetapost = function(theta){
(1/sqrt(2*pi*sigs1))*exp(-((theta-mu1)^2)/(2*sigs1))}
E_phipost = integrate(phi1*pi_thetapost, lower = -Inf, upper =Inf)
I was trying to do an integration, and I got an error that says:
non-numeric argument to binary operator
I think that * is a binary operator, but I am not sure how to figure this out.
Thanks~
A few problems here. First, barx isn't defined. But the main problem is that you can't just multiply functions in R. You can multiply the values returned by functions, but not the functions themselves. You need to pass a proper function to integrate().
After fixing that, you need to make sure all the functions you pass to integrate() are vectorized, and your phi1 function is not. You need to be able to pass in a vector and get a vector of the same length out. The easiest way to fix this is with Vectorize(). And finally, integrate() returns an object, not just a number. So if you want the calculated value, you need to extract it from the object before you can multiply it by something else. Try
f1 <- function(t){
exp(-t^2)
}
phi1 <- function(theta) {
0.5-1/sqrt(pi)*integrate(f1, lower = 0, upper = (10-theta)/(2*sqrt(2)))$value
}
barx <- 2 # or whatever
mu1 <- (450*barx)/(454)
sigs1 <- 36/454
pi_thetapost <- function(theta){
(1/sqrt(2*pi*sigs1))*exp(-((theta-mu1)^2)/(2*sigs1))
}
myfun <- function(x) {
Vectorize(phi1)(x)*pi_thetapost(x)
}
E_phipost <- integrate(myfun, lower = -Inf, upper =Inf)
E_phipost
# 3.690467e-05 with absolute error < 5.6e-05
E_phipost$value
# [1] 3.690467e-05
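To see what Vectorize() contributes, here is a small sketch that evaluates the wrapped phi1 on a vector. As a cross-check that is not part of the original answer, it uses the identity 0.5 - erf(z)/2 = pnorm(-z*sqrt(2)), which gives phi1(theta) = pnorm((theta - 10)/2). The names `phi1_vec` and `theta` are illustrative.

```r
f1 <- function(t) exp(-t^2)

# Scalar version from the answer: integrate() needs scalar limits,
# so theta must have length 1 here.
phi1 <- function(theta) {
  0.5 - 1 / sqrt(pi) *
    integrate(f1, lower = 0, upper = (10 - theta) / (2 * sqrt(2)))$value
}

# Vectorize() wraps phi1 so that it accepts a vector and returns a vector
# of the same length, which is what integrate() expects of its integrand.
phi1_vec <- Vectorize(phi1)

theta <- c(0, 5, 9)
phi1_vec(theta)

# Closed form for comparison (already vectorized, no integration needed)
pnorm((theta - 10) / 2)
```

If the closed form is acceptable, using it directly as phi1 avoids both the inner integrate() call and the need for Vectorize().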

Integrand Syntax in R

I'm trying to complete a straightforward integration, but I'm running into an issue that (I think) is due to the form in which I'm writing the integrand.
Suppose I want to find the area bounded by f(x) = 3x and g(x) = x^2, that is, the area between the two curves from x = 0 to x = 3.
Ok, so not a big deal to do analytically: the integral of 3x - x^2 from 0 to 3 is 27/2 - 9 = 4.5.
But I'd like to accomplish this with R, of course.
So I enter my function and there's a problem:
> g <- function(x) {3x-x^2}
Error: unexpected symbol in "g <- function(x) {3x"
This frustrated me, so I started playing around with things. Interestingly, I found that if I factor x out of the integrand and write it as x(3 - x), everything works smoothly:
> f <- function(x) {x*(3-x)}
> integrate(f, 0, 3)
4.5 with absolute error < 5e-14
My next step was to check ?integrate, part of which is attached below:
integrate(f, lower, upper, ..., subdivisions = 100L,
rel.tol = .Machine$double.eps^0.25, abs.tol = rel.tol,
stop.on.error = TRUE, keep.xy = FALSE, aux = NULL)
Arguments
f
an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
lower, upper
the limits of integration. Can be infinite.
Am I somehow not taking a numeric first argument in my first attempt to integrate? Thanks in advance.
Change 3x to 3*x.
(This may be the smallest answer-length-to-question-length ratio I've seen in a long time ;-)
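For completeness, a short sketch of the corrected version. R has no implicit multiplication, so the product must be written with an explicit `*`; the name `area` is illustrative.

```r
# Area between f(x) = 3x and g(x) = x^2 on [0, 3]
g <- function(x) 3 * x - x^2

area <- integrate(g, lower = 0, upper = 3)
area$value  # analytically 27/2 - 9 = 4.5
```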

Compute multiple Integral and plot them (with R)

I'm having trouble computing and then plotting multiple integrals. It would be great if you could help me.
So I have this function
> f = function(x, mu = 30, s = 12){dnorm(x, mu, s)}
which I want to integrate multiple times, from each z in 1:100 to +Inf, and then plot with x = z and y = auc:
> auc = integrate(f, z, Inf)
R returns:
Warning message:
In if (is.finite(lower)) { :
the condition has length > 1 and only the first element will be used
I have tested to do a loop :
while(z < 100){
z = 1
auc = integrate(f,z,Inf)
z = z+1}
Doesn't work either ... don't know what to do
(I'm new to R , so I'm already sorry if it is really easy .. )
Thanks for your help :) !
There is no need to do the integrating by hand. pnorm gives the integral from negative infinity to the input for the normal density. You can get the upper tail instead by modifying the lower.tail parameter
z <- 1:100
y <- pnorm(z, mean = 30, sd = 12, lower.tail = FALSE)
plot(z, y)
If you're looking to integrate more complex functions then using integrate will be necessary - but if you're just looking to find probabilities for distributions then there will most likely be a function built in that does the integration for you directly.
Your problem is actually somewhat subtle, and in a certain sense gets to the core of how R works, so here is a slightly longer explanation.
R is a "vectorized" language, which means that just about everything works on vectors. If I have 2 vectors A and B, then A+B is the element-by-element sum of A and B. Nearly all R functions work this way also. If X is a vector, then Y <- exp(X) is also a vector, where each element of Y is the exponential of the corresponding element of X.
The function integrate(...) is one of the few functions in R that is not vectorized. So when you write:
f <- function(x, mu = 30, s = 12){dnorm(x, mu, s)}
auc <- integrate(f, z, Inf)
the integrate(...) function does not know what to do with z when it is a vector. So it takes the first element and complains. Hence the warning message.
There is a special function in R, Vectorize(...) that turns scalar functions into vectorized functions. You would use it this way:
f <- function(x, mu = 30, s = 12){dnorm(x, mu, s)}
auc <- Vectorize(function(z) integrate(f,z,Inf)$value)
z <- 1:100
plot(z,auc(z), type="l") # plot lines
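The two approaches suggested in the answers can be checked against each other. The sketch below is an illustration with hypothetical names (`auc_num`, `auc_exact`): it computes one integral per lower limit with sapply(), which plays the same role as Vectorize() here, and compares the result with the upper-tail probability from pnorm().

```r
f <- function(x, mu = 30, s = 12) dnorm(x, mu, s)
z <- 1:100

# One integrate() call per lower limit, each with a scalar `lower`
auc_num <- sapply(z, function(lo) integrate(f, lower = lo, upper = Inf)$value)

# Upper-tail probability of the same normal distribution, computed directly
auc_exact <- pnorm(z, mean = 30, sd = 12, lower.tail = FALSE)

# Worst-case disagreement between the two approaches
max(abs(auc_num - auc_exact))
```

For a normal density the pnorm() route is both faster and more accurate; the integrate() route is the one that carries over to integrands without a built-in distribution function.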
