I would like to solve an equation as below, where the X is the only unknown variable and function f() is a multi-variate Student t distribution.
More precisely, I have a multi k-dimensional integral for a student density function, which gives us a probability as a result, and I know that this probability is given as q. The lower bound for all integral is -Inf and I know the last k-1 dimension's upper bound (as given), the only unknown variable is the first integral's upper bound. It should have an solution for a variable and one equation. I tried to solve it in R. I did Dynamic Conditional Correlation to have a correlation matrix in order to specify my t-distribution. So plug this correlation matrix into my multi t distribution "dmvt", and use the "adaptIntegral" function from "cubature" package to construct a function as an argument to the command "uniroot" to solve the upper bound on the first integral. But I have some difficulties to achieve what I want to get. (I hope my question is clear) I have provided my codes before, somebody told me that there is problem, but cannot find why there is an issue there. Many thanks in advance for your help.
I now how to deal with it with one dimension integral, but I don't know how a multi-dimension integral equation can be solved in R? (e.g. for 2 dimension case)
\int_{-\infty}^{X}
\int_{-\infty}^{Y_{1}} \cdots
\int_{-\infty}^{Y_{k}}
f(x,y_{1},\cdots y_{k})
d_{x}d_{y_{1},}\cdots d_{y_{k}} = q
This code fails:
require(cubature)
require(mvtnorm)
corr <- matrix(c(1,0.8,0.8,1),2,2)
f <- function(x){ dmvt(x,sigma=corr,df=3) }
g <- function(y) adaptIntegrate(f,
lowerLimit = c( -Inf, -Inf),
upperLimit = c(y, -0.1023071))$integral-0.0001
uniroot( g, c(-2, 2))
Since mvtnorm includes a pmvt function that computes the CDF of the multivariate t distribution, you don't need to do the integral by brute force. (mvtnorm also includes a quantile function qmvt, but only for "equicoordinate" values.)
So:
library(mvtnorm)
g <- function(y1_upr,y2_upr=-0.123071,target=1e-4,df=3) {
pmvt(upper=c(y1_upr,y2_upr),df=df)-target
}
uniroot(g,c(-10000,0))
## $root
## [1] -17.55139
##
## $f.root
## [1] -1.699876e-11
## attr(,"error")
## [1] 1e-15
## attr(,"msg")
## [1] "Normal Completion"
##
## $iter
## [1] 18
##
## $estim.prec
## [1] 6.103516e-05
##
Double-check:
pmvt(upper=c(-17.55139,-0.123071),df=3)
## [1] 1e-04
## attr(,"error")
## [1] 1e-15
## attr(,"msg")
## [1] "Normal Completion"
Related
I want to transform my excel solver model into a model in R. I need to find 3 sets of coordinates which minimizes the distance to the 5 other given coordinates. I've made a program which calculates a distance matrix which outputs the minimal distance from each input to the given coordinates. I want to minimize this function by changing the input. Id est, I want to find the coordinates such that the sum of minimal distances are minimized. I tried several methods to do so, see the code below (Yes my distance matrix function might be somewhat cluncky, but this is because I had to reduce the input to 1 variable in order to run some algorithms such as nloprt (would get warnings otherwise). I've also seen some other questions (such as GRG Non-Linear Least Squares (Optimization)) but they did not change/improve the solution.
# First half of p describes x coordinates, second half the y coordinates # yes thats cluncky
p<-c(2,4,6,5,3,2) # initial points
x_given <- c(2,2.5,4,4,5)
y_given <- c(9,5,7,1,2)
f <- function(Coordinates){
# Predining
Term_1 <- NULL
Term_2 <- NULL
x <- NULL
Distance <- NULL
min_prob <- NULL
l <- length(Coordinates)
l2 <- length(x_given)
half_length <- l/2
s <- l2*half_length
Distance_Matrix <- matrix(c(rep(1,s)), nrow=half_length)
# Creating the distance matrix
for (k in 1:half_length){
for (i in 1:l2){
Term_1[i] <- (Coordinates[k]-x_given[i])^2
Term_2[i] <- (Coordinates[k+half_length]-y_given[i])^2
Distance[i] <- sqrt(Term_1[i]+Term_2[i])
Distance_Matrix[k,i] <- Distance[i]
}
}
d <- Distance_Matrix
# Find the minimum in each row, thats what we want to obtain ánd minimize
for (l in 1:nrow(d)){
min_prob[l] <- min(d[l,])
}
som<-sum(min_prob)
return(som)
}
# Minimise
sol<-optim(p,f)
x<-sol$par[1:3]
y<-sol$par[4:6]
plot(x_given,y_given)
points(x,y,pch=19)
The solution however is clearly not that optimal. I've tried to use the nloptr function, but I'm not sure which algorithm to use. Which algorithm can I use or can I use/program another function which solves this problem? Thanks in advance (and sorry for the detailed long question)
Look at the output of optim. It reached the iteration limit and had not yet converged.
> optim(p, f)
$`par`
[1] 2.501441 5.002441 5.003209 5.001237 1.995857 2.000265
$value
[1] 0.009927249
$counts
function gradient
501 NA
$convergence
[1] 1
$message
NULL
Although the result is not that different you will need to increase the number of iterations to get convergence. If that is still unacceptable then try different starting values.
> optim(p, f, control = list(maxit = 1000))
$`par`
[1] 2.502806 4.999866 5.000000 5.003009 1.999112 2.000000
$value
[1] 0.005012449
$counts
function gradient
755 NA
$convergence
[1] 0
$message
NULL
I have five points for which I will fit a 2nd order polynomial in R. I will do so by minimizing the sum of squared errors of prediction (SSE).
What's the best way to do so?
So far I have done this:
(1,5.8), (2,3.9), (3,4.2), (4,5.7), (5,10.2) ## my data
To this data I want to fit a 2nd order polonium with the intercept 10 and the coefficient before x^2 is set to 1. I do this:
p<-c(5.8, 3.9, 4.2, 5.7, 10.2)
f<-function(x,a){x^2-ax+}
##sum of squared errors of prediction
SSE<-function(a){ (p[i]-f(i,a) )^2 }
nlm(SSE,0.1) ## 0.1 is picked totally random
But it returns a "wrong" answer:
$minimum
[1] 9.475923e-25
$estimate
[1] 4.96
When I manually calculate SSE for a=4.96 then is SSE=9.475923e-25. What do I do wrong? My biggest concern: does nlm function run through all my data points?
I'm not sure what you mean by "wrong". There are a few glitches in your code, and I get a slightly different answer (a almost exactly = 5).
p <- c(5.8, 3.9, 4.2, 5.7, 10.2)
x <- 1:5
f <- function(x,a){x^2-a*x+10}
##sum of squared errors of prediction
SSE <- function(a){ sum((p-f(x,a))^2) }
(n1 <- nlm(SSE,0.1)) ## 0.1 is picked totally random
## $minimum
## [1] 0.22
## $estimate
## [1] 4.999998
## $gradient
## [1] 1.443291e-10
## $code
## [1] 1
## $iterations
## [1] 3
This seems like a reasonable fit, although it doesn't match the data points exactly.
par(las=1,bty="l") ## cosmetic
plot(x,p,xlim=c(0,5.5))
xvec <- seq(0,6,length.out=100)
lines(xvec,f(xvec,n1$estimate))
You could also fit this model via
ypar <- p-x^2-10 ## subtract known terms from LHS
m1 <- lm(ypar~x-1) ## fit linear term (-1 suppresses intercept)
-1*coef(m1) ## flip sign
which gets a result of exactly 5.
Cans someone explain the results in a typical dt function? The help page says that I should receive the density function. However, in my code below, what does the first value ".2067" represent?The second value?
x<-seq(1,10)
dt(x, df=3)
[1] 0.2067483358 0.0675096607 0.0229720373 0.0091633611 0.0042193538 0.0021748674
[7] 0.0012233629 0.0007369065 0.0004688171 0.0003118082
Two things were confused here:
dt gives you the density, this is why it decreases for large numbers:
x<-seq(1,10)
dt(x, df=3)
[1] 0.2067483358 0.0675096607 0.0229720373 0.0091633611 0.0042193538 0.0021748674
[7] 0.0012233629 0.0007369065 0.0004688171 0.0003118082
pt gives the distribution function. This is the probability of being smaller or equal x.
This is why the values go to 1 as x increases:
pt(x, df=3)
[1] 0.8044989 0.9303370 0.9711656 0.9859958 0.9923038 0.9953636 0.9970069 0.9979617 0.9985521 0.9989358
A "probability density" is not really a true probability, since probabilities are bounded in [0,1] while densities are not. The integral of densities across their domain is normalized to exactly 1. So densities are really the first derivatives of the probability function. This code may help:
plot( x= seq(-10,10,length=100),
y=dt( seq(-10,10,length=100), df=3) )
The value of 0.207 for dt at x=1 says that at x=1 that the probability is increasing at a rate of 0.207 per unit increase in x. (And since the t-distribution is symmetric that is also the value of dt with 3 df at -1.)
A bit of coding to instantiate the dt(x,df=3) function (see ?dt) and then integrate it:
> dt3 <- function(x) { gamma((4)/2)/(sqrt(3*pi)*gamma(3/2))*(1+x^2/3)^-((3+1)/2) }
> dt3(1)
[1] 0.2067483
> integrate(dt3, -Inf, Inf)
1 with absolute error < 7.2e-08
Here's my setup
obs1<-c(1,1,1)
obs2<-c(0,1,2)
obs3<-c(0,0,3)
absoluteError<-function(obs,x){
return(sum(abs(obs-x)))
}
Example:
> absoluteError(obs2,1)
[1] 2
For a random vector of observations, I'd like to find a minimizer, x, which minimizes the absolute error between the observation values and a vector of all x. For instance, clearly the argument that minimizes absoluteError(obs1,x) is x=1 because this results in an error of 0. How do I find a minimizer for a random vector of observations? I'd imagine this is a linear programming problem, but I've never implemented one in R before.
The median of obs is a minimizer for the absolute error. The following is a sketch of how one might try proving this:
Let the median of a set of n observations, obs, be m. Call the absolute error between obs and m f(obs,m).
Case n is odd:
Consider f(obs,m+delta) where delta is some non zero number. Suppose delta is positive - then there are (n-1)/2 +1 observations whose error is delta more than f(obs,m). The remaining (n-1)/2 observations' error is at most delta less than f(obs,m). So f(obs,m+delta)-f(obs,m)>=delta. (The same argument can be made if delta is negative.) So the median is the only minimizer in this case. Thus f(obs,m+delta)>f(obs,m) for any non zero delta so m is a minimizer for f.
Case n is even:
Basically the same logic as above, except in this case any number between the two inner most numbers in the set will be a minimizer.
I am not sure this answer is correct, and even if it is I am not sure this is what you want. Nevertheless, I am taking a stab at it.
I think you are talking about 'Least absolute deviations', a form of regression that differs from 'Least Squares'.
If so, I found this R code for solving Least absolute deviations regression:
fabs=function(beta0,x,y){
b0=beta0[1]
b1=beta0[2]
n=length(x)
llh=0
for(i in 1:n){
r2=(y[i]-b0-b1*x[i])
llh=llh + abs(r2)
}
llh
}
g=optim(c(1,1),fabs,x=x,y=y)
I found the code here:
http://www.stat.colostate.edu/~meyer/hw12ans.pdf
Assuming you are talking about Least absolute deviations, you might not be interested in the above code if you want a solution in R from scratch rather than a solution that uses optim.
The above code is for a regression line with an intercept and one slope. I modified the code as follows to handle a regression with just an intercept:
y <- c(1,1,1)
x <- 1:length(y)
fabs=function(beta0,x,y){
b0=beta0[1]
b1=0
n=length(x)
llh=0
for(i in 1:n){
r2=(y[i]-b0-b1*x[i])
llh=llh + abs(r2)
}
llh
}
# The commands to get the estimator
g = optim(c(1),fabs,x=x,y=y, method='Brent', lower = (min(y)-5), upper = (max(y)+5))
g
I was not familiar with (i.e., had not heard of) Least absolute deviations until tonight. So, hopefully my modifications are fairly reasonable.
With y <- c(1,1,1) the parameter estimate is 1 (which I think you said is the correct answer):
$par
[1] 1
$value
[1] 1.332268e-15
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
With y <- c(0,1,2) the parameter estimate is 1:
$par
[1] 1
$value
[1] 2
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
With y <- c(0,0,3) the parameter estimate is 0 (which you said is the correct answer):
$par
[1] 8.613159e-10
$value
[1] 3
$counts
function gradient
NA NA
$convergence
[1] 0
$message
NULL
If you want R code from scratch, there is additional R code in the file at the link above which might be helpful.
Alternatively, perhaps it might be possible to extract the relevant code from the source file.
Alternatively, perhaps someone else can provide the desired code (and correct any errors on my part) in the next 24 hours.
If you come up with code from scratch please post it as an answer as I would love to see it myself.
lad=function(x,y){
SAD = function(beta, x, y) {
return(sum(abs(y - (beta[1] + beta[2] * x))))
}
d=lm(y~x)
ans1 = optim(par=c(d$coefficients[1], d$coefficients[2]),method = "Nelder-Mead",fn=SAD, x=x, y=y)
coe=setNames(ans1$par,c("(Intercept)",substitute(x)))
fitted=setNames(ans1$par[1]+ans1$par[2]*x,c(1:length(x)))
res=setNames(y-fitted,c(1:length(x)))
results = list(coefficients=coe, fitted.values=fitted, residuals=res)
class(results)="lad"
return(results)
}
I am trying to get the value of x that would minimize my equation y. I would like to use R.
The equation is:
y= [(a-bx)^2] / {[2bx /(1+x)]+c}
where a, b, c are all constant, but different to one another.
Thanks.
The standard optimize function should be sufficient for simple one-dimensional minimization:
a <- 2
b <- 1
c <-1
func <- function(x){(a-b*x)^2/((2*b*x/(1+x))+c)}
optimize(f=func, interval = c(-3,3))
$minimum
[1] -0.3333377
$objective
[1] -277201.4