I am using cuhre from R2Cuba 1.1-0 for integrating the following function
fn <- function(x) {
pnorm((-2-sum(sqrt(vecRho)*x))/sqrt(1-sum(vecRho)))*prod(dnorm(x))
}
where vecRho is a vector of 6 numbers between 0 and 0.1, i.e.
vecRho<-runif(6,0,0.1)
By definition, the integrand fn is between 0 and 1. The integration is expected to be positive. However, using cuhre the result becomes negative when the length of vecRho exceeds 5.
NDIM<-length(vecRho)
cuhre(NDIM, 1, fn,
flags = list(verbose =0),
lower = rep(-10,NDIM),
upper = rep(10,NDIM))$value
[1] -0.4738284
Moreover, when the length of vecRho >=6 the absolute value of the integration increases as the length of vecRho increases.
Is there something I can do to fix this? Thanks!
ok, got it, you have 6D integral with 6 gaussian kernels inside. I know good answer for 1D integration with gaussian kernel. It is called Gauss-Hermite quadrature, and there is an R package for that. If you go this route, you'll have to
make curry functions, but it might worth it.
Some sample code:
library(gaussquad)
n.quad <- 128 # integration order
# get the particular (weights,abscissas) as data frame
# with 2 observables and n.quad observations
rule <- ghermite.h.quadrature.rules(n.quad, mu = 0.0)[[n.quad]]
# test function - integrate 1 over exp(-x^2) from -Inf to Inf
# should get sqrt(pi) as an answer
f <- function(x) {
1.0
}
q <- ghermite.h.quadrature(f, rule)
print(q - sqrt(pi))
Related
Let g(x) = 1/(2*pi) exp ( - x^2 / 2) be the density of the normal distribution with mean 0 and standard deviation 1. In some calculation on paper appeared integrals of the form
where c>0 is a positive number.
Since I could not evaluate this by hand, I had the idea to approximate and plot it. I tried this in R, because R provides the dnorm function and a function to do integrals.
You see that I need to integrate numerically n times, where n shall be chosed by the call of a plot function. My code has an for-loop to create those "incomplete" convolutions iterativly.
For example even with n=3 and c=1 this gives me an error. n=2 (thus it's one integration) works.
N = 3
ngauss <- function(x) dnorm(x , mean = 0, sd = 1)
convoluts <- list()
convoluts[[1]] <- ngauss
for (i in 2:N) {
h <- function(y) {
g <- function(z) {ngauss(y-z)*convoluts[[i-1]](z)}
return(integrate(g, lower = -1, upper = 1)$value)
}
h <- Vectorize(h)
convoluts[[i]] <- h
}
convoluts[[3]](0)
What I get is:
Error: evaluation nested too deeply: infinite recursion /
options(expressions=)?
I understand that this is a hard computation, but for "small" n something similar should possible.
Maybe someone can help me to fix my code or provide a recommendation how I can implement this in a better way. Another language that is more appropriate for this would be also okay.
The issue appears to be in how integrate deals with variables in different environments. In particular, it doesn't really deal with i correctly in each iteration. Instead using
h <- evalq(function(y) {
g <- function(z) {ngauss(y - z) * convoluts[[i - 1]](z)}
integrate(g, lower = -1, upper = 1)$value
}, list(i = i))
does the job and, say, setting N <- 6 quickly gives
convoluts[[N]](0)
# [1] 0.03423872
As your integration is simply the pdf of a sum of N independent standard normals (which then follows N(0, N)), we may also verify this approach by setting lower = -Inf and upper = Inf. Then with N <- 4 we have
dnorm(0, sd = sqrt(N))
# [1] 0.1994711
convoluts[[N]](0)
# [1] 0.1994711
So, for practical purposes, when c = Inf, you are way better off using dnorm rather than manual computations.
I'm trying to tackle a nonlinear optimization problem where the objective functions are non-linear and constraints are linear. I read a bit on the ROI package in R and I decided to use the same. However, I am facing a problem while solving the optimization problem.
I am essentially trying to minimize the area under a supply-demand curve. The equation for the supply and demand curves are defined in the code:
Objective function: minimize (Integral of supply curve + integral of demand curve),
subject to constraints q greater than or equal to 34155 (stored in a variable called ICR),
q greater than or equal to 0
and q less than or equal to 40000.
I have tried to run this through the ROI package in RStudio and I keep getting an error telling me that there is no solver to be found.
library(tidyverse)
library(ROI)
library(rSymPy)
library(mosaicCalc)
# Initializing parameters for demand curve
A1 <- 6190735.2198302800
B1 <- -1222739.9618776600
C1 <- 103427.9556133250
D1 <- -4857.0627045073
E1 <- 136.7660814828
# Initializing parameters for Supply Curve
S1 <- -1.152
S2 <- 0.002
S3 <- a-9.037e-09
S4 <- 2.082e-13
S5 <- -1.64e-18
ICR <- 34155
demand_curve_integral <- antiD(A1 + B1*q + C1*(q^2)+ D1*(q^3) + E1*(q^4) ~q)
supply_curve_integral <- antiD(S1 + S2*(q) + S3*(q^2) + S4*(q^3) + S5*(q^4)~q)
# Setting up the objective function
obj_func <- function(q){ (18.081*demand_curve_integral(q))+supply_curve_integral(q)}
# Setting up the optimization Problem
lp <- OP(objective = F_objective(obj_func, n=1L),
constraints=L_constraint(L=matrix(c(1, 1, 1), nrow=3),
dir=c(">=", ">=", "<="),
rhs=c(ICR, 0, 40000, 1))),
maximum = FALSE)
sol <- ROI_solve(lp)
This is the error that I keep getting in RStudio:
Error in ROI_solve(lp) : no solver found for this signature:
objective: F
constraints: L
bounds: V
cones: X
maximum: FALSE
C: TRUE
I: FALSE
B: FALSE
What should I do to rectify this error?
In general you could use ROI.plugin.alabama or ROI.plugin.nloptr for this optimization problem.
But I looked at the problem and this raised several questions.
a is not defined in the code.
You state that q has length 1 and add 3 linear constraints the constraints say
q >= 34155, q >= 0, q <= 40000 or q <= 1
I am not entirely sure since the length of rhs is 4 but L and dir
suggest there are only 3 linear constraints.
How should the constraint look like?
34155 <= q <= 40000?
Then you could specify the constraint as bounds and use ROI.plugin.optimx
or since you have a one dimensional optimization problem just use optimize
from the stats package https://stat.ethz.ch/R-manual/R-devel/library/stats/html/optimize.html.
I haven't run NLP using ROI. But you have to install an ROI solver plug-in and then load the library in your code. The current solver plug-ins are:
library(ROI.plugin.glpk)
library(ROI.plugin.lpsolve)
library(ROI.plugin.neos)
library(ROI.plugin.symphony)
library(ROI.plugin.cplex)
Neos provides access to NLP solvers but I don't know how to pass solver parameters via an ROI plug-in function call.
https://neos-guide.org/content/nonlinear-programming
I wish to integrate (1/y)*(2/(1+(log(y))^2)) from 0 to 1. Wolfram alpha tells me this should be pi. But when I do monte carlo integration in R, I keep getting 3.00 and 2.99 after trying over 10 times. This is what I have done:
y=runif(10^6)
f=(1/y)*(2/(1+(log(y))^2))
mean(f)
I copied the exact function into wolfram alpha to check that the integral should be pi
I tried to check if my y is properly distributed by checking it's mean and plotting a historgram, and it seems to be ok. Could there be something wrong with my computer?
Edit: Maybe someone else could copy my code and run it themselves, to confirm that it isn't my computer acting up.
Ok, first let's start with simple transformation, log(x) -> x, making integral
I = S 2/(1+x^2) dx, x in [0...infinity]
where S is integration sign.
So function 1/(1+x^2) is falling monotonically and reasonable fast. We need some reasonable PDF to sample points in [0...infinity] interval, such that most of the region where original function is significant is covered. We will use exponential distribution with some free parameter which we will use to optimize sampling.
I = S 2/(1+x^2)*exp(k*x)/k k*exp(-k*x) dx, x in [0...infinity]
So, we have k*e-kx as properly normalized PDF in the range of [0...infinity]. Function to integrate is (2/(1+x^2))*exp(k*x)/k. We know that sampling from exponential is basically -log(U(0,1)), so code to do that is very simple
k <- 0.05
# exponential distribution sampling from uniform vector
Fx <- function(x) {
-log(x) / k
}
# integrand
Fy <- function(x) {
( 2.0 / (1.0 + x*x) )*exp(k*x) / k
}
set.seed(12345)
n <- 10^6L
s <- runif(n)
# one could use rexp() as well instead of Fx
# x <- rexp(n, k)
x <- Fx(s)
f <- Fy(x)
q <- mean(f)
print(q)
Result is equal to 3.145954, for seed 22345 result is equal to 3.135632, for seed 32345 result is equal to 3.146081.
UPDATE
Going back to original function [0...1] is quite simple
UPDATE II
changed per prof.Bolker suggestion
y <- matrix(c(7, 9, -5, 0, 2, 6), ncol = 1)
try <- t(y)
tryy <- try %*% y
i <- solve(tryy)
h <- y %*% i %*% try
uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf))
Error in (1 - x) * diag(6) : non-conformable arrays
The purpose of this command uniroot(as.vector(solve(((1-x) * diag(6)) + h)), c(-Inf, Inf)) is to solve the characteristics equation det[(1-λ)I+h] = 0
where, λ=eigenvalues , I=identity matrix , h=hat matrix=y(y'y)^(-1)y'
here λ is unknown ,we have to solve for it.
I am not understanding where is the problem here? I have tried as:
as.vector(solve(6*diag(6)+h))
This is not non-conformable. But why is not working inside the uniroot function?
Your question is a bit confusing, so I have to make a couple of assumptions. If you want the eigenvalues of h, then the characteristic equation is:
det(h - I*λ) = 0
not
det[(1-λ)I+h] = 0
So I used the former.
Given the above, the short answer is: do it this way.
f <- function(lambda) det(h -lambda*diag(6))
F <- Vectorize(f)
library(rootSolve)
uniroot.all(F,c(-1000,1000),n=2000)
# [1] 0 1
# or, much more simply
eigen(h)$values
# [1] 1.000000e+00 2.220446e-16 0.000000e+00 -2.731318e-18 -6.876381e-18 -7.365903e-17
So h has 2 eigenvalues, 0 and 1. Note that the built-in function eigen(...) finds 6 roots, but 5 of them are within the machine tolerance of 0.
The question about why your code fails is a bit more involved.
First, your code:
tryy <- try %*% y
is the dot product of y with itself (so, a scalar), returned as a matrix with one element. When you "invert" that using solve(...)
i <- solve(tryy)
you simply take the reciprocal, so i is also a matrix with 1 element. I'm not sure if this is what you had in mind.
Second, uniroot(...) does not work this way. The first argument must be a function; you've passed an expression which depends on x, which in turn is undefined. You could try:
f <- function(x) det(h-x*diag(6))
uniroot(f,c(-Inf,Inf))
but this wouldn't work either because (a) uniroot(...) works on a finite interval, (b) it requires that the function f(...) have different sign at the ends of the interval, and (c) in any event it would return only one root (the smaller one).
So you could use uniroot.all(...) in package rootSolve. uniroot.all(...) also requires a function as it's first argument, but there's a twist: the function must be "vectorized". This means that if you pass a vector of lambda values, f(...) should return a vector of the same length. Fortunately in R there is an easy way to "vectorize" a given function, as in:
F <- Vectorize(f).
Even this has it's limits. uniroot.all(...) also requires a finite interval, so we have to guess what that is, and also it evaluates F on n sub-intervals. So if your interval does not contain all the roots, or if the sub-intervals are not small enough, you will not find all the roots.
Using the built-in eigen(...) function is definitely the best option.
I have been using the Excel solver to handle the following problem
solve for a b and c in the equation:
y = a*b*c*x/((1 - c*x)(1 - c*x + b*c*x))
subject to the constraints
0 < a < 100
0 < b < 100
0 < c < 100
f(x[1]) < 10
f(x[2]) > 20
f(x[3]) < 40
where I have about 10 (x,y) value pairs. I minimize the sum of abs(y - f(x)). And I can constrain both the coefficients and the range of values for the result of my function at each x.
I tried nls (without trying to impose the constraints) and while Excel provided estimates for almost any starting values I cared to provide, nls almost never returned an answer.
I switched to using optim, but I'm having trouble applying the constraints.
This is where I have gotten so far-
best = function(p,x,y){sum(abs(y - p[1]*p[2]*p[3]*x/((1 - p[3]*x)*(1 - p[3]*x + p[2]*p[3]*x))))}
p = c(1,1,1)
x = c(.1,.5,.9)
y = c(5,26,35)
optim(p,best,x=x,y=y)
I did this to add the first set of constraints-
optim(p,best,x=x,y=y,method="L-BFGS-B",lower=c(0,0,0),upper=c(100,100,100))
I get the error ""ERROR: ABNORMAL_TERMINATION_IN_LNSRCH"
and end up with a higher value of the error ($value). So it seems like I am doing something wrong. I couldn't figure out how to apply my other set of constraints at all.
Could someone provide me a basic idea how to solve this problem that a non-statistician can understand? I looked at a lot of posts and looked in a few R books. The R books stopped at the simplest use of optim.
The absolute value introduces a singularity:
you may want to use a square instead,
especially for gradient-based methods (such as L-BFGS).
The denominator of your function can be zero.
The fact that the parameters appear in products
and that you allow them to be (arbitrarily close to) zero
can also cause problems.
You can try with other optimizers
(complete list on the optimization task view),
until you find one for which the optimization converges.
x0 <- c(.1,.5,.9)
y0 <- c(5,26,35)
p <- c(1,1,1)
lower <- 0*p
upper <- 100 + lower
f <- function(p,x=x0,y=y0) sum(
(
y - p[1]*p[2]*p[3]*x / ( (1 - p[3]*x)*(1 - p[3]*x + p[2]*p[3]*x) )
)^2
)
library(dfoptim)
nmkb(p, f, lower=lower, upper=upper) # Converges
library(Rvmmin)
Rvmmin(p, f, lower=lower, upper=upper) # Does not converge
library(DEoptim)
DEoptim(f, lower, upper) # Does not converge
library(NMOF)
PSopt(f, list(min=lower, max=upper))[c("xbest", "OFvalue")] # Does not really converge
DEopt(f, list(min=lower, max=upper))[c("xbest", "OFvalue")] # Does not really converge
library(minqa)
bobyqa(p, f, lower, upper) # Does not really converge
As a last resort, you can always use a grid search.
library(NMOF)
r <- gridSearch( f,
lapply(seq_along(p), function(i) seq(lower[i],upper[i],length=200))
)