I am very new to programming and have been essentially learning by trial and error, but I have reached a problem I do not know how to approach. I need to do a double integration over a triangular area in R. As the usual integrate function doesn't seem able to handle this, I tried using the cubature package (edited: see below for the full code).
Update/Edit:
I've been working on this more and am still coming up against the same issue. I understand that I have to ensure that values are within the appropriate bounds with respect to the asin calculation. However, this still isn't getting around the fundamental problem of the triangular area. Perhaps it will be clearer if I post my full code below:
L <- 25
n <- -4
area <- 30
distances <- L*seq(0.005, 100, 0.05)
cond <- area*pi
d <- 5
fun <- function(x=1, r=0)
{
  if (x < cond) {
    return(0)
  } else {
    return((-1)*((n+2)/(2*pi*(L^2)))*(1+((x/L)^2))^(n/2)*(1/pi)*(1/pi)*acos(d/x)*asin(sqrt((pi*area)/d+r)))
  }
}
fun(5)
fun(300)
library(cubature)
integrationone <- function()
{
  integrand <- adaptIntegrate(fun, lowerLimit=c(d,0), upperLimit=c(80,80))
  return(integrand$integral)
}
integrationone()
warnings()
From looking at the warning messages, R seems unable to evaluate the conditional test while integrating over x, so I still can't restrict the values to the exact area I want to integrate over. Does anyone have any ideas or advice?
I don't think the code behind adaptIntegrate will help you see what's happening. You can type adaptIntegrate at the console and you will get its source; it is essentially a call to a C algorithm.
To understand what is happening, I think you first need to understand what you are integrating. Try to simplify your function to see its domain of definition.
INV_PI <- 1/pi
fun <- function(X) {
  # n, L, d, r and area are taken from the enclosing environment;
  # the part that does not depend on X is collected in 'scale'
  scale <- -1 * ((n+2)/(2*pi*(L^2))) * INV_PI^2 * acos(d/(d+r))
  res <- scale * asin(sqrt((pi*area)/X)) * (1 + ((X/L)^2))^(n/2)
  sqrt(prod(res))
}
There are two terms in X here, but only one of them can cause problems:
asin(sqrt((pi*area)/X))
asin is defined only on [-1, 1], and sqrt is defined only for non-negative numbers.
So fun is defined on [pi*area, Inf), and you have to integrate over this domain.
For example:
low.Lim <- pi*area
doubleintegration <- function()
{
  integrand <- adaptIntegrate(fun, lowerLimit = c(low.Lim, low.Lim),
                              upperLimit = c(200*low.Lim, 200*low.Lim))
  return(integrand$integral)
}
doubleintegration()
[1] 0.1331089
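Note that the limits above still describe a square region. If the region really is a triangle, a common workaround (sketched here with an illustrative triangle and integrand, not taken from the answer above) is to integrate over the bounding rectangle and set the integrand to zero outside the triangle:
library(cubature)
# Sketch: integrate x*y over the triangle with vertices (0,0), (1,0), (0,1).
# The triangle test and the integrand are illustrative assumptions.
tri_integrand <- function(X) {
  x <- X[1]; y <- X[2]
  if (x + y > 1) return(0)  # outside the triangle: contribute nothing
  x * y
}
adaptIntegrate(tri_integrand, lowerLimit = c(0, 0), upperLimit = c(1, 1))$integral
# exact value: 1/24 = 0.041666...
The discontinuity along the triangle's edge slows convergence, so expect more function evaluations than for a smooth integrand.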
Related
I am trying to solve a system of nonlinear equations with bound constraints on the unknowns.
Here is my code:
fun <- function(x) {
b0 <- (0.64*1+(1-0.64)*x[1]*(x[2]*x[1]-1)+x[1]*1*(1-x[2])*x[3])/(x[1]-1) -1805*2.85*0.64
b1plus <- (0.64*1+(1-0.64)*x[1]*(x[2]*x[1]-1.01)+x[1]*1.01*(1-x[2])*x[3])/(1.01*(x[1]-1)) -1805*2.85*0.64*(1+0.00235)
b1minus <- (0.64*1+(1-0.64)*x[1]*(x[2]*x[1]-0.99)+x[1]*0.99*(1-x[2])*x[3])/(0.99*(x[1]-1)) -1805*2.85*0.64*(1-0.00235)
return(c(b0,b1plus,b1minus))
}
library(rootSolve)
multiroot(fun, c(1.5, 0, 0))
However, the result I get is far from the actual solution. I want to constrain x[1] to the range (1.5, 4), x[2] to (0, 1), and x[3] to (0, 10000). How can I do that?
Thank you!!
Methods like Newton-Raphson, as used in multiroot or nleqslv, do not work well with bound constraints. One possible approach is to square and sum the components of your function
fun1 <- function(x) sum(fun(x)^2)
and then treat this as a global optimization problem where you hope for a minimum value of 0.0. For example, GenSA provides an implementation of "simulated annealing" that works reasonably well in low dimensions.
library(GenSA)
res <- GenSA(par=NULL, fn=fun1,
lower=c(1.5,0,0), upper=c(4,1,10000),
control=list(maxit=10e5))
res$value; res$par
## [1] 119.7869
## [1] 4.00 0.00 2469.44
Several tries did not find a lower function value than this one, which makes me think there is no common root in the constraint box you requested.
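As a further cross-check (a sketch, not part of the original answer; the starting point is an arbitrary choice), a box-constrained local optimizer can be run on fun1 from a few starting values:
# Sketch: local box-constrained minimization of fun1 as a cross-check.
res2 <- optim(par = c(2, 0.5, 100), fn = fun1, method = "L-BFGS-B",
              lower = c(1.5, 0, 0), upper = c(4, 1, 10000))
res2$value; res2$par
If several such starts all terminate at values well above zero, that supports the conclusion that no root exists inside the box.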
I tried to solve these non-linear equations using nleqslv, but it does not work well. I suspect the reason is that I didn't separate the two unknowns onto different sides of the equations.
My questions are:
1. Are there any other packages that could solve this kind of equation system?
2. Is there an effective way in R to rearrange the equations so that they meet the requirements of the nleqslv package?
Thank you guys.
Here is the code; p[1] and p[2] are the two unknowns I want to solve for.
dslnex<-function(p){
p<-numeric(2)
0.015=sum(exp(Calib2$Median_Score*p[1]+p[2])*weight_pd_bad)
cum_dr<-0
for (i in 1:length(label)){
cum_dr[i]<-exp(Calib2$Median_Score*p[1]+p[2][1:i]*weight_pd_bad[1:i]/0.015
}
mid<-0
for (i in 1:length(label)){
mid[i]<-sum(cum_dr[1:i])/2
}
0.4=(sum(mid*weight_pd_bad)-0.5)/(0.5*(1-0.015))
}
pstart<-c(-0.000679354,-4.203065891)
z<- nleqslv(pstart, dslnex, jacobian=TRUE,control=list(btol=.01))
Following up on my comment, I have rewritten your function as follows, correcting errors and inefficiencies.
Errors and other changes are given as inline comments.
# no need to use dslnex as name for your function
# dslnex <- function(p){
# any valid name will do
f <- function(p) {
# do not do this
# you are overwriting p as passed by nleqslv
# p<-numeric(2)
# declare return vector
y <- numeric(2)
y[1] <- 0.015 - (sum(exp(Calib2$Median_Score*p[1]+p[2])*weight_pd_bad))
# do not do this
# cum_dr is initialized as a scalar and will be made into a vector
# which will be grown as a new element is inserted (can be very inefficient)
# cum_dr<-0
# so declare cum_dr to be a vector with length(label) elements
cum_dr <- numeric(length(label))
for (i in 1:length(label)){
cum_dr[i] <- exp(Calib2$Median_Score*p[1] + p[2][1:i]*weight_pd_bad[1:i]/0.015)  # check this grouping; see the remark about [1:i] below
}
# same problem as above
# mid<-0
mid <- numeric(length(label))
for (i in 1:length(label)){
mid[i]<-sum(cum_dr[1:i])/2
}
y[2] <- 0.4 - (sum(mid*weight_pd_bad)-0.5)/(0.5*(1-0.015))
# return vector y
y
}
pstart <-c(-0.000679354,-4.203065891)
z <- nleqslv(pstart, f, jacobian=TRUE, control=list(btol=.01))
nleqslv is intended for solving systems of equations of the form f(x) = 0, and the system must be square.
So the function must return a vector with the same length as the x-vector.
You should now be able to proceed, provided your system of equations has a solution and there are no further errors in your equations. I have my doubts about the [1:i] in the expression for cum_dr and in the expression for mid[i]. The loop calculating mid can probably be written as a single statement: mid <- cumsum(cum_dr)/2. Up to you.
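To see the required shape of the function, here is a minimal, self-contained illustration (the two equations are made up for the example):
library(nleqslv)
# A square 2x2 system f(x) = 0; each component is one equation
# written as "left-hand side minus right-hand side".
ftest <- function(x) {
  y <- numeric(2)
  y[1] <- x[1]^2 + x[2]^2 - 2   # x1^2 + x2^2 = 2
  y[2] <- exp(x[1] - 1) - x[2]  # exp(x1 - 1) = x2
  y
}
nleqslv(c(2, 0.5), ftest)$x
# converges to (1, 1)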
I'm looking for a way to properly integrate my function:
lik = function(par, x){
  cl = c()
  for(i in 1:ncluster){
    sub = c()
    for(j in 1:nsub){
      times = t[[i]][[j]]
      m = c(1, t[[i]][j], t(cov[[i]][j,]))
      repmat = cbind(1, 1:t[[i]][j], matrix(rep(cov[[i]][j,], times), times, 3, byrow=T))
      sub[j] = d[[i]][j]*m%*%c(par[-5], x) - sum(log((1+exp(repmat%*%c(par[-5], x)))))
    }
    cl[i] = sum(sub)
  }
  return(exp(cl))
}
The function lik (the likelihood) takes x and a vector par of length 5, and yields a vector of likelihood values at x, one for each cluster. For example,
> lik(1:5,1)
[1] 4.640101e-30 3.632315e-44 5.348611e-09 1.121790e-27 1.696704e-98
> #number of clusters=5
I want to integrate out x so that I can obtain the vector of marginalized pdf values at each cluster, but integrate and the other numerical integration packages I have found can only integrate scalar functions. I've searched questions related to this, and maybe vectorization is the key to solving this problem, but I just do not know how.
I would really appreciate any help you can give me. Thanks
Typically I recommend converting a function of two variables into a function of one variable before integrating, along these lines:
myfunc <- function(x, y){ stuff }
intfunc <- function(x) myfunc(x, y = y0)  # fix y at some value y0
integrate(intfunc, lower, upper)
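For the vector-valued case in the question, one sketch (the wrapper names marginal and f_i are introduced here; it assumes lik(par, x) returns one value per cluster when x is scalar) is to integrate each component separately, wrapping it as a scalar function:
# Sketch: component-wise integration of a vector-valued likelihood.
marginal <- function(par, ncluster, lower = -Inf, upper = Inf) {
  sapply(seq_len(ncluster), function(i) {
    # integrate() passes a vector of x values, so vectorize by hand
    f_i <- function(x) sapply(x, function(xx) lik(par, xx)[i])
    integrate(f_i, lower, upper)$value
  })
}
# e.g. marginal(1:5, ncluster = 5)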
I am trying to write a function which takes a positive real number and keeps adding terms of the harmonic series until the total sum exceeds the initial argument.
I need my function to display the total number of terms of the series that were added.
Here's my code so far:
harmonic<-function(n){
x<-c(0,1)
while (length(x) < n) {
position <- length(x)
new <- 1/(x[position] + x[position-1])
x <- c(x,new)
}
return(x)
}
I apologise for the errors in my code; unfortunately I have been working with R for only a month, this is the first time I am using a while loop, and I couldn't find any useful information about it.
Thank you, I'd really appreciate your help.
Here's an attempt based on some info from this post at maths.stackexchange: https://math.stackexchange.com/q/496116
I can't speak as to whether it is highly accurate in all circumstances or even the best or an appropriate way to go about this. Caveat emptor.
harmsum.cnt <- function(x, tol = 1e-09) {
  em.cons <- 0.577215664901533  # Euler-Mascheroni constant
  # H_n is approximated by log(n) + gamma + 1/(2n) - 1/(12n^2);
  # difffun is the gap between the target x and that approximation
  difffun <- function(x, n) x - (log(n) + em.cons + 1/(2*n) - 1/(12*n^2))
  ceiling(uniroot(difffun, c(1, 1e10), tol = tol, x = x)$root)
}
Seems to work alright though:
harmsum.cnt(7)
#[1] 616
harmsum.cnt(15)
#[1] 1835421
Compare:
tail(cumsum(1/1:616),1); tail(cumsum(1/1:615),1)
#7.001274
#6.999651
dput(tail(cumsum(1/1:1835421),1)); dput(tail(cumsum(1/1:1835420),1))
#15.0000003782678
#14.9999998334336
This is a partial answer, which I'll try to fill in later. On the assumption that you want an exact answer, rather than the excellent approximation formula thelatemail found, there are a few tools to consider.
First, a hash table or the memoise package will let you save previous calculations, thus saving a lot of time.
Second, since the sum of a (finite) sequence is independent of the grouping, you can calculate, e.g., the first N terms and the terms (N+1):2N independently. Use the parallel package to divide and conquer.
Third, before you get too deep into the morass, check the limits of floating-point accuracy via .Machine$double.eps. Once your 1/n terms come close to that, you'll need to switch over to the gmp and Rmpfr packages to keep full accuracy in your calculations.
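For example, a minimal sketch of a higher-precision partial sum with Rmpfr (the 128-bit precision is an arbitrary choice for illustration):
library(Rmpfr)
# Sketch: partial harmonic sum computed at 128-bit precision
# instead of standard double precision.
n <- 10^5
sum(1 / mpfr(1:n, precBits = 128))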
Now, just to clarify what you "should" be doing, a correct loop is
mylimit <- 7   # pick a target value
N <- 1e7       # a generous cap on the number of iterations
harmsum <- 0
for (k in 1:N) {
  harmsum <- harmsum + 1/k
  if (harmsum >= mylimit) break
}
(or a similar setup using while, as in the sketch below)
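A sketch of the equivalent while version, wrapped as the function the question asked for (it returns the number of terms added):
harmonic <- function(limit) {
  harmsum <- 0
  k <- 0
  while (harmsum <= limit) {
    k <- k + 1
    harmsum <- harmsum + 1/k
  }
  k  # number of terms needed for the sum to exceed 'limit'
}
harmonic(7)  # 616, matching harmsum.cnt(7) above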
I'm using R to create a function that, among other things, uses the Mills ratio (see here). This is not a complicated formula, and at first I just programmed it like this:
mill <- function(x) {
return((1 - pnorm(x)) / dnorm(x))
}
I soon found out, however, that for very large values of x (x >= 9) this function returns zero. Even more dramatically, at around x >= 37 it starts returning NaN, which really messes up my calculations.
Following the article, for now I've changed the function into this:
mill <- function(x) {
if (x >= 9) {
return(1 / x)
} else {
return((1 - pnorm(x)) / dnorm(x))
}
}
This seems to work. However, I use this function to calculate a vector, and when I use simulation to find the same vector, I get more or less the same answer, only it's always a bit off.
I think this has to do with my implementation of the Mills ratio, since the rest of the function is just exponentials, which R should have no trouble with.
I want to ask if there is any way to solve this problem: either by implementing this function better, or by finding the Mills ratio another way (perhaps through integration of some sort, but wouldn't I run into the same issues there?). Thank you kindly for any help you can provide!
I would make two changes to your original mill function:
1. Change 1 - pnorm(x) to pnorm(x, lower.tail = FALSE).
2. Work on the log scale, taking exponentials only when needed.
So this gives
new_mill = function(x)
pnorm(x, lower.tail=FALSE, log.p=TRUE) - dnorm(x, log=TRUE)
So
R> exp(new_mill(10))
[1] 0.09903
R> exp(new_mill(40))
[1] 0.02498
Using a plot as a sanity check
x = seq(0, 10, 0.001)
plot(x, exp(new_mill(x)), type="l")
lines(x, mill(x), col=2)
gives [plot: exp(new_mill(x)) in black with mill(x) overlaid in red, for x from 0 to 10]