Coding a multiple integral function in R

With the goal of turning the following into a function, I was wondering how I can write the following double integral in R code (with $\bar{x} = \mu$):

Assuming pi0 and pi1 implement your functions $\pi_0$ and $\pi_1$ in a vectorized way, a possible solution is:
integral <- function(n, mu, s, pi0, pi1) {
C <- (2 * pi)^(-n/2)
C * integrate(f = function(sigmavec) sapply(sigmavec, function(sigma) {
integrate(f = function(delta) {
exp(-n/2 * ((mu / sigma - delta)^2 + (s / sigma)^2)) * pi1(delta)
}, lower = -Inf, upper = Inf)$value
}) * pi0(sigmavec) / (sigmavec^n), lower = 0, upper = Inf)$value
}
# Tests
integral(n = 1, mu = 0, s = 1, pi0 = dnorm, pi1 = dnorm)
# [1] 0.0473819
integral(n = 1, mu = 0, s = 1, pi0 = function(sigma) 1/sigma, pi1 = dcauchy)
# [1] 0.2615783
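The sapply wrapper is there because integrate expects its integrand to accept a vector of points and return a vector of the same length, which a bare inner integrate call cannot do. A minimal sketch of the same nesting pattern on a toy integrand with a known answer (the integral of x*y over the unit square is 1/4); the name double_integral is made up for this illustration:

```r
# Toy nested integral: int_0^1 int_0^1 x * y dy dx = 1/4.
# double_integral is a hypothetical helper name, not part of the answer above.
double_integral <- function() {
  outer_integrand <- function(xvec) {
    # integrate() calls this with a vector of x values, so evaluate the
    # inner integral once per element and return a vector of equal length
    sapply(xvec, function(x) {
      integrate(function(y) x * y, lower = 0, upper = 1)$value
    })
  }
  integrate(outer_integrand, lower = 0, upper = 1)$value
}

val <- double_integral()
val
```

Dropping the sapply and calling the inner integrate directly on xvec would fail or give wrong results, because the inner integrand would then mix vectors of different lengths.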

Not sure if this question is on topic, but I am open to answering.
Maybe you should ask a more general question: how do you compute an integral with a computer program (code)? There are at least two ways:
Using numerical integration, such as quadrature or the Monte Carlo method.
Using a symbolic toolbox to solve the problem analytically and then plugging in the numerical values.
Examples for $\int_0^1 x^2 \, dx$:
f<-function(x){
x^2
}
curve(f,0,1)
# method 1: numerical integration (adaptive quadrature)
integrate(f,lower=0,upper = 1)
# method 2: symbolic integration with Ryacas
library(Ryacas)
x <- Sym("x")
f <- function(x) {
x^2
}
f2=yacas(yacas(Integrate(f(x), x)))
f2
x <- 1
Eval(f2)
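Since the Monte Carlo method is mentioned but not shown, here is a rough sketch of it for the same integral. The sample size and seed are arbitrary choices, and mc_est and mc_se are names introduced only for this example:

```r
# Monte Carlo estimate of int_0^1 x^2 dx (exact value: 1/3).
set.seed(1)                  # arbitrary seed, only for reproducibility
n <- 1e5                     # number of random draws (arbitrary choice)
u <- runif(n)                # uniform sample on [0, 1]
mc_est <- mean(u^2)          # the average of f(U) estimates the integral
mc_se <- sd(u^2) / sqrt(n)   # standard error of that estimate
c(estimate = mc_est, std_error = mc_se)
```

The estimate only converges at rate 1/sqrt(n), so for a smooth one-dimensional integrand like this one integrate() is the better tool; Monte Carlo becomes attractive mainly in higher dimensions.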

Related

Integrating under a curve in R

I apologise if this is a duplicate; I've read answers to similar questions to no avail.
I'm trying to integrate under a curve, given a specific formula (below) for said integration.
As a toy example, here's some data:
library(deSolve) # provides lsoda()
Antia_Model <- function(t,y,p1){
r <- p1[1]; k <- p1[2]; p <- p1[3]; o <- p1[4]
P <- y[1]; I <- y[2]
dP = r*P - k*P*I
dI = p*I*(P/(P + o))
list(c(dP,dI))
}
r <- 0.25; k <- 0.01; p <- 1; o <- 1000 # Note that r can range btw 0.1 and 10 in this model
parms <- c(r, k, p, o)
P0 <- 1; I0 <- 1
N0 <- c(P0, I0)
TT <- seq(0.1, 50, 0.1)
results <- lsoda(N0, TT, Antia_Model, parms, verbose = FALSE)
P <- results[,2]; I <- results[,3]
As I understand it, I should be able to use the auc() function from the MESS package (can I just use the integrate() function? Unclear...), which should look something like this:
auc(P, TT, from = x1, to = x2, type = "spline")
Though I don't really understand how to use the "from" and "to" arguments, or how to incorporate "u" from the original integration formula...
Using the integrate() function seems more intuitive, but if I try:
u <- 1
integrand <- function(P) {u*P}
q <- integrate(integrand, lower = 0, upper = Inf)
I get this error:
# Error in integrate(integrand, lower = 0, upper = Inf) :
# the integral is probably divergent
As you can tell, I'm pretty lost, so any help would be greatly appreciated! Thank you so much! :)
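integrand is technically acceptable, but with u = 1 it is just the identity function f(x) = x, and it never uses your simulated P values: the argument P is simply the variable of integration. The area under f(x) = x over [0, inf) is infinite, so the integral diverges.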
From the documentation of integrate the first argument is:
an R function taking a numeric first argument and returning a numeric vector of the same length. Returning a non-finite element will generate an error.
If instead you use a pulse function:
pulse <- function(x) {ifelse(x < 5 & x >= 0, 1, 0)}
integrate(pulse, lower = 0, upper = Inf)
#> 5 with absolute error < 8.5e-05
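If the goal is the area under the simulated curve P(t) rather than under a formula, the integral has to be built from the sampled points. A rough sketch, assuming u is a constant multiplier as in the attempted integrand, and using exp(-t) on a grid as a stand-in for the lsoda output so the result can be checked against a known value (tt, p, x1 and x2 are placeholder names):

```r
# Stand-in for the solver output: a curve sampled on a time grid.
tt <- seq(0, 5, by = 0.1)
p  <- exp(-tt)
u  <- 1      # assumed constant multiplier
x1 <- 0      # lower integration limit (placeholder)
x2 <- 5      # upper integration limit (placeholder)

# Option 1: trapezoidal rule on the sampled points inside [x1, x2]
keep <- tt >= x1 & tt <= x2
tk <- tt[keep]
pk <- p[keep]
area_trap <- u * sum(diff(tk) * (pk[-1] + pk[-length(pk)]) / 2)

# Option 2: interpolate with a cubic spline, then integrate the interpolant
p_fun <- splinefun(tt, p)
area_spline <- integrate(function(t) u * p_fun(t), lower = x1, upper = x2)$value

c(trapezoid = area_trap, spline = area_spline, exact = 1 - exp(-5))
```

With the real simulation, tt and p would be replaced by TT and P, and the limits have to stay inside the simulated time range because the spline is only trustworthy between the sampled points.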

Is there a simple way to calculate the maximum likelihood estimate of a parameter in R?

I am trying to calculate the MLE of a Poisson distribution in R. Is there a function in R that allows us to do this? (I know Stata has an mlexp command that makes this calculation quite easy.) I see that there is a mlexp function in the univariateML package for the exponential distribution. That being said, is there a command that allows this for more than just exponential distributions?
The fitdistrplus package might help with what you want, though it can't handle more complicated distributions. The mle() function of the stats4 package might also help. I believe the bbmle package is more general than the other options, but I haven't really used it myself.
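As an illustration of the stats4 route, here is a minimal sketch for the Poisson case. The data are simulated, and the seed, sample size, starting value and search bounds are arbitrary assumptions made for the example:

```r
library(stats4)  # ships with R

set.seed(42)                          # arbitrary seed for reproducibility
counts <- rpois(1000, lambda = 14)    # simulated data, for illustration only

# Negative log-likelihood as a function of the parameter to estimate
nll <- function(lambda) -sum(dpois(counts, lambda, log = TRUE))

# One-dimensional problem, so a bounded Brent search is enough
fit <- mle(nll, start = list(lambda = 5), method = "Brent",
           lower = 0.01, upper = 100, nobs = length(counts))
lambda_hat <- unname(coef(fit))
c(mle = lambda_hat, sample_mean = mean(counts))
```

For the Poisson distribution the maximum likelihood estimate of lambda is the sample mean, so the two numbers should agree up to the optimizer's tolerance.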
Here's a function I've used for custom distributions; it is similar to stats4::mle. It assumes that the first argument of the PDF is named x (as in R's implementations of all the classic distributions).
mle <- function(data, fun, init, logarg = TRUE, lower = -Inf, upper = Inf) {
if (logarg) {
fnll <- function(...) {
-sum(do.call(fun, c(x = list(data), log = TRUE, as.list(...))))
}
} else {
fnll <- function(...) {
return(-sum(log(do.call(fun, c(x = list(data), as.list(...))))))
}
}
params <- optim(init, fnll, method = "L-BFGS-B", lower = lower, upper = upper)$par
names(params) <- formalArgs(fun)[-1][seq_along(init)]
return(params)
}
# optim will check the boundaries
# set lim0 and lim1 for help in setting bounds
lim0 <- .Machine$double.eps
lim1 <- 1 + lim0
# Poisson distribution
data <- rpois(1e5, 14)
rbind(trueML = c(lambda = mean(data)),
mle = mle(data, dpois, 1, lower = lim0))
#> lambda
#> trueML 13.99387
#> mle 13.99387
# normal distribution
data <- rnorm(1e5, -2, 3)
rbind(trueML = c(mean = mean(data), sd = sd(data)*sqrt((length(data) - 1)/length(data))),
mle = mle(data, dnorm, 0:1, lower = c(-Inf, lim0)))
#> mean sd
#> trueML -2.002658 2.993441
#> mle -2.002657 2.993442
# gamma distribution
data <- rgamma(1e5, 0.5, 0.1)
c(mle = mle(data, dgamma,
init = c(mean(data)^2/var(data), mean(data)/var(data)),
lower = rep(lim0, 2)))
#> mle.shape mle.rate
#> 0.5007139 0.1003400
# triangular distribution
dtri <- function(x, a, b, c) {
if (a > b) {a <- (b - a) + (b <- a)}
if (b > c) {c <- (b - c) + (b <- c)}
blna <- x < b
p <- numeric(length(x))
p[blna] <- 2*(x[blna] - a)/(c - a)/(b - a)
p[!blna] <- 2*(c - x[!blna])/(c - a)/(c - b)
return(p)
}
rtri <- function(n, a, b, c) {
if (a > b) {a <- (b - a) + (b <- a)}
if (b > c) {c <- (b - c) + (b <- c)}
fb <- (b - a)/(c - a)
U <- runif(n)
blna <- U < fb
r <-numeric(n)
r[blna] <- a + sqrt(U[blna]*(c - a)*(b - a))
r[!blna] <- c - sqrt((1 - U[!blna])*(c - a)*(c - b))
return(r)
}
data <- rtri(1e5, -6, -3, 3)
mind <- min(data); maxd <- max(data)
# set logarg to FALSE because dtri doesn't have a log argument
c(mle = mle(data, dtri, logarg = FALSE,
init = c(mind - abs(mind)*lim0, median(data), maxd + abs(maxd)*lim0),
lower = c(-Inf, min(data), maxd + abs(maxd)*lim0),
upper = c(mind - abs(mind)*lim0, max(data), Inf)))
#> mle.a mle.b mle.c
#> -6.003666 -3.000262 2.994473
Created on 2021-11-04 by the reprex package (v2.0.1)

how can I set up this equation for constrained maximization?

How can I write this equation in R as a function?
subject to: 20* x1 + 170*x2 = 20000
#ATTEMPT
library(Rsolnp)
fn <- function(x){ # solnp passes a single parameter vector
z=200 * x[1]^(2/3) * x[2]^(1/3)
return(-z)} # negate because solnp minimizes
# constraint z1: 20*x+170*y=20000
eqn <- function(x) {
z1=20*x[1] + 170*x[2]
return(c(z1))
}
constraints = c(20000)
x0 <- c(1, 1) # setup init values
sol1 <- solnp(x0, fun = fn, eqfun = eqn, eqB = constraints)
sol1$pars
In R, we define a function with the keyword function and pass it the necessary parameters. For example, in this case:
R <- function(h, s)200 * h^(2/3) * s^(1/3)
We now have a function called R, that takes in arguments h and s and gives us an output.
For example, we could do:
R(27, 8)
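The function above only defines the objective; the budget constraint still has to be imposed. One way to do that in base R, sketched here under the assumption that the objective is 200 * x1^(2/3) * x2^(1/3) as in the attempt, is to solve the equality constraint for x1 and maximize over x2 alone with optimize (the names objective and x1_from_x2 are invented for this example):

```r
# Objective from the question's attempt (assumed form)
objective <- function(x1, x2) 200 * x1^(2/3) * x2^(1/3)

# Equality constraint 20*x1 + 170*x2 = 20000, solved for x1
x1_from_x2 <- function(x2) (20000 - 170 * x2) / 20

# x2 must keep x1 non-negative, so it lies in [0, 20000/170]
opt <- optimize(function(x2) objective(x1_from_x2(x2), x2),
                interval = c(0, 20000 / 170), maximum = TRUE)

x2_star <- opt$maximum
x1_star <- x1_from_x2(x2_star)
c(x1 = x1_star, x2 = x2_star, value = opt$objective)
```

For this Cobb-Douglas form the optimum can be checked by hand: two thirds of the budget goes to x1 and one third to x2, which gives x1 = 2000/3 and x2 = 2000/51. Substitution only works because there is a single linear equality constraint; with more constraints a solver such as solnp is the more natural choice.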

Integration in R language

I'm trying to compute the integral between 1 and some cutoff 'cut' of the function given in the R-code below as 'int'. It depends on 2 parameters dM[i] and dLambda[j] defined before I make the integration and for each pair I save the results in vector 'vec':
vec = c() #vector for INT values: this is our goal
dM = seq(from = 0, to = 3, by = 0.01) #vector for mass density parameter
dLambda = seq(from = -1.5, to = 3, by = 0.01) #vector for vacuum energy density parameter
for (i in 1:length(dM)) {
for (j in 1:length(dLambda)) {
int = function(x) ((dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2))
cut = 30
INT_data = integrate(int, 1, cut)
INT = INT_data$value
vec = c(vec, INT)
}
}
But when I run the script I get the error: "Error in integrate(int, 1, cut) : non-finite function value". Nonetheless, if I try the following code
int = function(x) ((0*x^4*(x - 1) -1.5*x^2*(1 - x^2) + x^4)^(-1/2))
cut = 30
INT_data = integrate(int, 1, cut)
INT = INT_data$value
vec = c(vec, INT)
I get the correct result without any error. So the error above is not true, it can calculate the integral but it seems that R cannot work it out if I use the 2 'for'-loops. How can I re-write the code so I can compute all the different values for dM[i] and dLambda[j] I want?
Your function is only defined for some values of dM and dLambda. You can use the try() function to attempt evaluation, but not stop in case an error occurs.
It's also a lot more efficient to pre-allocate the object to hold the results; running vec = c(vec, INT) gradually grows it, and that's very slow, because R needs to keep creating new vectors just one element longer than the last one.
This code fixes both issues, and then plots the result:
dM <- seq(from = 0, to = 3, by = 0.01) #vector for mass density parameter
dLambda <- seq(from = -1.5, to = 3, by = 0.01) #vector for vacuum energy density parameter
result <- matrix(NA, length(dM), length(dLambda))
for (i in 1:length(dM)) {
for (j in 1:length(dLambda)) {
int <- function(x) ((dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2))
cut <- 30
INT_data <- try(integrate(int, 1, cut), silent = TRUE)
if (!inherits(INT_data, "try-error"))
result[i, j] <- INT_data$value
}
}
image(dM, dLambda, result)
Edited to add: Here's how this works. If integrate signals an error in your original code, the loop will stop. try() prevents that. If there's no error, it returns the result of the integrate call. If there is an error, it returns an object with information about the error. That object has class "try-error", so the check if (!inherits(INT_data, "try-error")) is basically asking "Was there an error?" If there was an error, nothing happens, and that entry of the result is left as NA, as it was initialized. The loop then goes on to try the next dM, dLambda pair.
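The same pattern can be tried out in isolation. A small sketch with a hypothetical function risky standing in for the integrate call; it fails for one of three inputs, and the loop still fills in the other two:

```r
# A function that fails for some inputs, standing in for integrate()
risky <- function(x) {
  if (x < 0) stop("negative input")
  sqrt(x)
}

inputs <- c(4, -1, 9)
out <- rep(NA_real_, length(inputs))   # pre-allocated, NA by default

for (k in seq_along(inputs)) {
  res <- try(risky(inputs[k]), silent = TRUE)
  # res is either the numeric result or an object of class "try-error"
  if (!inherits(res, "try-error")) out[k] <- res
}
out
```

The failed entry simply stays NA, which makes it easy to see afterwards which parameter combinations could not be evaluated.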
The problem is mathematical rather than being related to coding. The function is not defined for the whole domain you are integrating. With dM[1] = 0 and dLambda > 1, your expression
(dM[i]*x^4*(x - 1) + dLambda[j]*x^2*(1 - x^2) + x^4)^(-1/2)
simplifies to
(dLambda[j] * x^2 * (1 - x^2) + x^4)^(-1/2)
so let's take dLambda[j] at 1.01, which is where your calculation stops:
(1.01 * x^2 * (1 - x^2) + x^4)^(-1/2)
which is
(1.01 * x^2 - 1.01 * x^4 + x^4)^(-1/2)
or
(1.01 * x^2 - 0.01 * x^4)^(-1/2)
Now, you are evaluating x between 1 and 30. So what happens when x = 11?
(1.01 * 121 - 0.01 * 14641)^(-1/2)
This leaves you
(122.21 - 146.41)^(-1/2)
which is equivalent to
1/sqrt(-24.2)
So the reason for the error is that you are integrating a function in a domain in which it is undefined.
The function is badly behaved for other values of dM too, with infinite peaks in the midst of the range, so even using the integrate(..., stop.on.error = F) option won't allow you to keep calculating because you will get an infinite sum.
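The arithmetic for x = 11 is easy to confirm numerically. A small sketch that evaluates the expression under the square root for dM = 0 and dLambda = 1.01 and shows where it turns non-positive on [1, 30] (radicand is a name introduced here for illustration, and the grid step of 0.01 is an arbitrary choice):

```r
# Expression under the square root in the integrand
radicand <- function(x, dM, dLambda) {
  dM * x^4 * (x - 1) + dLambda * x^2 * (1 - x^2) + x^4
}

r11 <- radicand(11, dM = 0, dLambda = 1.01)
r11                      # negative, so r11^(-1/2) is NaN

# Scan the integration range for points where the integrand is undefined
xs <- seq(1, 30, by = 0.01)
bad <- xs[radicand(xs, dM = 0, dLambda = 1.01) <= 0]
range(bad)               # part of [1, 30] where the radicand is not positive
```

Running the same scan inside the loop before calling integrate would be one way to skip parameter pairs for which the integrand is not real-valued over the whole range.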

Error in using optim to maximise the likelihood in r

So, I have these functions:
funk1 <- function(a,x,l,r) {
x^2*exp(-(l*(1-exp(-r*a))/r))}
funk2 <- function(x,l,r) {
sapply(x, function (s) {
integrate(funk1, lower = 0, upper = s, x=s, l=l, r=r)$value })}
which are used to explain the data y in,
z <- data.frame(ts = 1:100,
y = funk2(1:100, l = 1, r = 1) + rpois(100, 1:100))
I wish to use optim to maximise the likelihood, so I defined a likelihood function:
LL_funk <- function(l,r) {
n=nrow(z)
R = sum((funk2(ts,l,r) - y)^2)
logl = -((n/2)*log(R))
return(-logl)
}
and I tried to fit using optim
fit <- optim(par=c(0.5,0.5), fn= LL_funk, method="Nelder-Mead")
But I get an error:
Error in integrate(funk1, lower = 0, upper = s, x = s, l = l, r = r) :
a limit is missing
I am not sure why? I could run nls fitting funk2(x,l,r) to y
nls(y ~ funk2(ts,l,r), data = z, start = list(l = 0.5, r = 0.5))
That means funk2 is working. I guess its the problem with LL function that I have designed, which I cant figure out!! Please Help!
Yup! There were two problems with your function. This worked for me:
LL_funk <- function(params) {
n=nrow(z)
l = params[1]
r = params[2]
R = sum((funk2(z$ts,l,r) - z$y)^2)
logl = -((n/2)*log(R))
return(-logl)
}
Previous issues:
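optim passes all parameters as one vector, so LL_funk must take a single argument (the vector of parameters) rather than separate l and r.
On the right-hand side of the assignment of R, ts and y were not actually referring to columns in your dataset; they need to be z$ts and z$y.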
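To see the calling convention in isolation: optim hands fn a single numeric vector, and the data are looked up explicitly through the data frame. A toy least-squares sketch with simulated data (toy, sse and the true values 2 and 3 are invented for the illustration):

```r
set.seed(123)                                   # arbitrary seed
toy <- data.frame(ts = seq(0, 1, length.out = 50))
toy$y <- 2 + 3 * toy$ts + rnorm(50, sd = 0.1)   # simulated line with noise

# Objective takes ONE vector; unpack the parameters inside
sse <- function(params) {
  a <- params[1]
  b <- params[2]
  sum((a + b * toy$ts - toy$y)^2)               # columns referenced via toy$
}

fit <- optim(par = c(0, 1), fn = sse, method = "Nelder-Mead")
fit$par
```

Writing sse <- function(a, b) instead would leave b without a value when optim calls it, since optim only ever supplies the first argument.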
