Computation of numerical integral involving convolution - r

I have to solve the following convolution-related numerical integration problem in R, or perhaps in a computer algebra system like Maxima.
Integral[({k(y)-l(y)}^2)dy]
where
k(.) is the pdf of a standard normal distribution
l(y)=integral[k(z)*k(z+y)dz] (standard convolution)
z and y are scalars
The domain of y is -inf to +inf.
The integral in function l(.) is an indefinite integral. Do I need to add any additional assumption on z to obtain this?
Thank you.

Here is a symbolic solution from Mathematica:

R does not do symbolic integration, just numerical integration. There is the Ryacas package, which interfaces with Yacas, a symbolic math program that may help.
See the distr package for possible help with the convolution parts (it will do the convolutions, I just don't know if the result will be integrable symbolically).
You can numerically integrate the convolutions from distr using the integrate function, but all the parameters need to be specified as numbers not variables.

For the record, here is the same problem solved with Maxima 5.26.0.
(%i2) k(u):=exp(-(1/2)*u^2)/sqrt(2*%pi) $
(%i3) integrate (k(x) * k(y + x), x, minf, inf);
(%o3) %e^-(y^2/4)/(2*sqrt(%pi))
(%i4) l(y) := ''%;
(%o4) l(y):=%e^-(y^2/4)/(2*sqrt(%pi))
(%i5) integrate ((k(y) - l(y))^2, y, minf, inf);
(%o5) ((sqrt(2)+2)*sqrt(3)-2^(5/2))/(4*sqrt(3)*sqrt(%pi))
(%i6) float (%);
(%o6) .02090706601281356
Sorry for the late reply. Leaving this here in case someone finds it by searching.
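For anyone who wants a purely numerical cross-check of the Maxima result, here is a minimal sketch in Python (standard library only, not R). The helper names simpson, l_numeric and l_closed are my own, and the infinite ranges are truncated to finite intervals on the assumption that the Gaussian tails beyond them are negligible.

```python
import math

def k(u):
    """Standard normal pdf."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def l_numeric(y, lim=12.0, n=2400):
    """l(y) = integral of k(z) * k(z + y) dz, truncated to [-lim, lim]."""
    return simpson(lambda z: k(z) * k(z + y), -lim, lim, n)

def l_closed(y):
    """Closed form reported by Maxima: exp(-y^2/4) / (2*sqrt(pi))."""
    return math.exp(-y * y / 4.0) / (2.0 * math.sqrt(math.pi))

# Outer integral of (k(y) - l(y))^2, again truncated to a finite range.
value = simpson(lambda y: (k(y) - l_closed(y)) ** 2, -20.0, 20.0, 4000)
print(value)                              # compare with (%o6) above
print(l_numeric(1.0), l_closed(1.0))      # the inner convolution, two ways
```

The inner integral is a definite integral over z from -inf to +inf, so no extra assumption on z is needed beyond integrating it out.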

I am trying to do something similar in MATLAB, where I convolve the densities of two Rayleigh-distributed random variables. The result fz_fun turns out to be equal to fy_fun, and I don't know why. Maybe someone here knows?
sigma1 = 0.45;
sigma2 = 0.29;
fx_fun = @(x) [0*x(x<0) , (x(x>=0)./sigma1^2).*exp(-0.5*(x(x>=0)./sigma1).^2)];
fy_fun = @(y) [0*y(y<0) , (y(y>=0)./sigma2^2).*exp(-0.5*(y(y>=0)./sigma2).^2)];
% Rayleigh distribution of random var X,Y:
step = 0.1;
x= -2:step:3;
y= -2:step:3;
%% Convolution:
z= y;
fz = zeros(size(y));
for i = 1:length(y)
    fz_fun(i) = integral(@(z) fy_fun(y(i)).*fx_fun(z-y(i)),0,Inf); % probability density of random variable z= x+y
end
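A possible explanation, offered as a guess rather than a tested MATLAB fix: in the loop above the integrand evaluates fy_fun at the fixed point y(i) and integrates over z, so fy_fun(y(i)) is a constant factor and the remaining integral of fx_fun over its whole support is 1, which leaves fz equal to fy. The convolution needs the integration variable inside both densities, fz(z) = integral of fx(z - y) * fy(y) dy. Here is a minimal sketch of that in Python (standard library only); rayleigh_pdf, simpson and conv_pdf are my own names, and the grid sizes are arbitrary.

```python
import math

def rayleigh_pdf(x, sigma):
    """Rayleigh density; zero for negative arguments."""
    if x < 0:
        return 0.0
    return (x / sigma**2) * math.exp(-0.5 * (x / sigma) ** 2)

def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def conv_pdf(z, sigma1=0.45, sigma2=0.29, n=200):
    """Density of Z = X + Y at z: integral over y of fx(z - y) * fy(y)."""
    if z <= 0:
        return 0.0
    # Both densities vanish for negative arguments, so y only ranges over [0, z].
    return simpson(lambda y: rayleigh_pdf(z - y, sigma1) * rayleigh_pdf(y, sigma2),
                   0.0, z, n)

# Sanity check: a density should integrate to (approximately) 1.
total_mass = simpson(conv_pdf, 0.0, 5.0, 500)
print(total_mass)
```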

Related

Is there any way to linearize x-x^2<=0?

I am trying to solve an optimization problem.
The objective function and all constraints of this problem are linear except x-x^2<=0.
Is there any way to linearize x-x^2<=0, where x is a continuous variable?
Note that x is not in the objective function.
The usual approach is to convert the problem to an iterative, non-linear one where you solve for increments:
f(x) = x - x^2
df/dx = 1 - 2x
Make an initial guess x0; take a step for dx; solve for df; calculate x1 = x0 + dx and f1 = f0 + df and iterate until convergence.
You might look into optimization with constraints. Read up on Lagrange multipliers.
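To make the linearization step concrete, here is a small sketch in Python (my choice of language; the helper linearize and the guess x0 = 2 are illustrative assumptions). Since x - x^2 = x(1 - x), the constraint holds exactly when x <= 0 or x >= 1, so one linear constraint can only capture part of that set. Because f is concave, the tangent at x0 lies on or above f, so the linearized constraint is a conservative stand-in near the current guess.

```python
def f(x):
    return x - x * x

def df(x):
    return 1.0 - 2.0 * x

def linearize(x0):
    """Return (slope, intercept) of the tangent f(x0) + f'(x0) * (x - x0)."""
    slope = df(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept

# Tangent at the guess x0 = 2: slope = -3, intercept = 4,
# so the linear constraint reads -3*x + 4 <= 0, i.e. x >= 4/3.
slope, intercept = linearize(2.0)

# Every sampled x that satisfies the linear constraint also satisfies x - x^2 <= 0.
samples = [4.0 / 3.0 + 0.1 * i for i in range(50)]
all_feasible = all(slope * x + intercept <= 1e-12 and f(x) <= 0.0 for x in samples)
print(slope, intercept, all_feasible)
```

In an iterative scheme the tangent would be recomputed at each new iterate. Which branch (x <= 0 or x >= 1) you end up on depends on the starting guess.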

Linear regression and matrix division in Julia

The well known formula for OLS is (X'X)^(-1)X'y where X is nxK and y is nx1.
One way to implement this in Julia is (X'*X)\X'*y.
But I found that X\y gives almost the same output, up to a tiny computational error.
Do they always compute the same thing (as long as n>k)? If so, which one should I use?
When X is square, there is a unique solution and LU-factorization (with pivoting) is a numerically-stable way to calculate this. That is the algorithm that backslash uses in this case.
When X is not square, which is the case in most regression problems, there is no unique solution but there is a unique least-squares solution. The QR factorization method for solving Xβ = y is a numerically stable method for generating the least-squares solution, and in this case X\y uses the QR factorization and thus gives the OLS solution.
Notice the words numerically stable. While (X'*X)\X'*y will theoretically always give the same result as backslash, in practice backslash (with the correct factorization choice) will be more precise. This is because the factorization algorithms are implemented to be numerically stable. Because of the chance for floating point errors to accumulate when doing (X'*X)\X'*y, it's not recommended that you use this form for any real numerical work.
Instead, (X'*X)\X'*y is somewhat equivalent to an SVD factorization, which is the most numerically stable algorithm but also the most expensive (in fact, it's basically writing out the Moore-Penrose pseudoinverse, which is how an SVD factorization is used to solve a linear system). To directly do an SVD factorization using a pivoted SVD, do svdfact(X) \ y on v0.6 or svd(X) \ y on v0.7. Doing this directly is more stable than (X'*X)\X'*y. Note that qrfact(X) \ y or qr(X) \ y (v0.7) is for QR. See the factorizations portion of the documentation for more details on all of the choices.
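To illustrate the two routes on a toy problem, here is a minimal sketch in plain Python (not Julia, and not what backslash does internally: Julia uses a pivoted Householder QR, while this sketch uses modified Gram-Schmidt on a two-column X for brevity). The function names and the data are made up for the example.

```python
import math

def ols_normal_equations(X, y):
    """Solve (X'X) b = X'y for a two-column X using the 2x2 closed-form inverse."""
    a11 = sum(r[0] * r[0] for r in X)
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X)
    b1 = sum(r[0] * yi for r, yi in zip(X, y))
    b2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a11 * a22 - a12 * a12
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

def ols_qr(X, y):
    """Solve min ||y - X b|| for a two-column X via modified Gram-Schmidt QR."""
    c1 = [r[0] for r in X]
    c2 = [r[1] for r in X]
    r11 = math.sqrt(sum(v * v for v in c1))
    q1 = [v / r11 for v in c1]
    r12 = sum(q * v for q, v in zip(q1, c2))
    u2 = [v - r12 * q for v, q in zip(c2, q1)]
    r22 = math.sqrt(sum(v * v for v in u2))
    q2 = [v / r22 for v in u2]
    # Back-substitution on the triangular system R b = Q'y.
    qty1 = sum(q * yi for q, yi in zip(q1, y))
    qty2 = sum(q * yi for q, yi in zip(q2, y))
    beta2 = qty2 / r22
    beta1 = (qty1 - r12 * beta2) / r11
    return [beta1, beta2]

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # intercept column plus x
y = [1.0, 3.0, 5.0, 7.0]                               # exactly y = 1 + 2*x
beta_ne = ols_normal_equations(X, y)
beta_qr = ols_qr(X, y)
print(beta_ne, beta_qr)
```

On a well-conditioned problem like this the two agree to rounding error. The difference shows up when the columns of X are nearly collinear, because forming X'X squares the condition number.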
According to the documentation, the result of X\y is as follows (the documentation uses the notation \(A, B) rather than X and y):
For rectangular A the result is the minimum-norm least squares solution
This is your case, I guess, since you assume n>k (so your matrix is not square), so you can safely use X\y. In fact it is better to use it than the standard formula: you will get a result even if the rank of X is less than min(n,k), whereas the standard formula (X'*X)^(-1)*X'*y will fail or produce a numerically unstable result if X'*X is nearly singular.
If X were square (which is not your case), a slightly different rule from the documentation applies:
For input matrices A and B, the result X is such that A*X == B when A is square
This means that the \ algorithm would produce an error if your matrix were singular, or produce numerically unstable results if the matrix were nearly singular (in practice, it is most often the lu function, called internally for general dense matrices, that throws a SingularException).
If you want a catch-all solution (for square and non square matrices) then qr(X, Val(true)) \ y can be used.
Short answer: No, use the first one (the well-known one).
Long answer:
The linear regression model is Xβ = y, and it's easy to derive β = X \ y, which is your second method. However, most of the time (when X is not invertible) this is wrong, since you cannot simply left-multiply by X^-1. The correct way is to solve β = argmin{‖y - Xβ‖^2} instead, which leads to the first method.
To show they are not always the same, simply construct a case where X is not invertible:
julia> X = rand(10, 10)
10×10 Array{Float64,2}:
0.938995 0.32773 0.740556 0.300323 0.98479 0.48808 0.748006 0.798089 0.864154 0.869864
0.973832 0.99791 0.271083 0.841392 0.743448 0.0951434 0.0144092 0.785267 0.690008 0.494994
0.356408 0.312696 0.543927 0.951817 0.720187 0.434455 0.684884 0.72397 0.855516 0.120853
0.849494 0.989129 0.165215 0.76009 0.0206378 0.259737 0.967129 0.733793 0.798215 0.252723
0.364955 0.466796 0.227699 0.662857 0.259522 0.288773 0.691278 0.421251 0.593215 0.542583
0.126439 0.574307 0.577152 0.664301 0.60941 0.742335 0.459951 0.516649 0.732796 0.990509
0.430213 0.763126 0.737171 0.433884 0.85549 0.163837 0.997908 0.586575 0.257428 0.33239
0.28398 0.162054 0.481452 0.903363 0.780502 0.994575 0.131594 0.191499 0.702596 0.0967979
0.42463 0.142 0.705176 0.0481886 0.728082 0.709598 0.630134 0.139151 0.423227 0.942262
0.197805 0.526095 0.562136 0.648896 0.805806 0.168869 0.200355 0.557305 0.69514 0.227137
julia> y = rand(10, 1)
10×1 Array{Float64,2}:
0.7751785556478308
0.24185992335144801
0.5681904264574333
0.9134364924569847
0.20167825754443536
0.5776727022413637
0.05289808385359085
0.5841180308242171
0.2862768657856478
0.45152080383822746
julia> ((X' * X) ^ -1) * X' * y
10×1 Array{Float64,2}:
-0.3768345891121706
0.5900885565174501
-0.6326640292669291
-1.3922334538787071
0.06182039005215956
1.0342060710792016
0.045791973670925995
0.7237081408801955
1.4256831037950832
-0.6750765481219443
julia> X \ y
10×1 Array{Float64,2}:
-0.37683458911228906
0.5900885565176254
-0.6326640292676649
-1.3922334538790346
0.061820390052523294
1.0342060710793235
0.0457919736711274
0.7237081408802206
1.4256831037952566
-0.6750765481220102
julia> X[2, :] = X[1, :]
10-element Array{Float64,1}:
0.9389947787349187
0.3277301697101178
0.7405555185711721
0.30032257202572477
0.9847899425069042
0.48807977638742295
0.7480061513093117
0.79808859136911
0.8641540973071822
0.8698636291189576
julia> ((X' * X) ^ -1) * X' * y
10×1 Array{Float64,2}:
0.7456524759867015
0.06233042922132548
2.5600126098899256
0.3182206475232786
-2.003080524452619
0.272673133766017
-0.8550165639656011
0.40827327221785403
0.2994419115664999
-0.37876151249955264
julia> X \ y
10×1 Array{Float64,2}:
3.852193379477664e15
-2.097948470376586e15
9.077766998701864e15
5.112094484728637e15
-5.798433818338726e15
-2.0446050874148052e15
-3.300267174800096e15
2.990882423309131e14
-4.214829360472345e15
1.60123572911982e15

How extreme values of a functional can be found using R?

I have a functional like this :
(LaTex formula: $v[y]=\int_0^2 (y'^2+23yy'+12y^2+3ye^{2t})dt$)
with given start and end conditions y(0)=-1, y(2)=18.
How can I find extreme values of this functional in R? I realize how it can be done in Excel, for example, but I didn't find an appropriate solution in R.
Before trying to solve such a task in a numerical setting, it might be better to lean back and think about it for a moment.
This is a problem typically treated in the mathematical discipline of "variational calculus". A necessary condition for a function y(t) to be an extremum of the functional (i.e. the integral) is the so-called Euler-Lagrange equation; see "Calculus of Variations" at Wolfram MathWorld.
Applying it to f(t, y, y') as the integrand in your request, I get (please check, I can easily have made a mistake)
y'' - 12*y - 3/2*exp(2*t) = 0
You can go now and find a symbolic solution for this differential equation (with the help of a textbook, or some CAS), or solve it numerically with the help of an R package such as 'deSolve'.
PS: Solving this as an optimization problem based on discretization is possible, but may lead you on a long and stony road. I remember solving the "brachistochrone problem" to a satisfactory accuracy only by applying several hundred variables (not in R).
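For readers who want to check the Euler-Lagrange step, here it is written out for the integrand F(t, y, y') of the functional in the question. This is just the standard formula applied term by term. Note that the 23yy' term drops out because it is a total derivative.

```latex
F(t, y, y') = y'^2 + 23\,y\,y' + 12\,y^2 + 3\,y\,e^{2t}

\frac{\partial F}{\partial y} = 23\,y' + 24\,y + 3\,e^{2t},
\qquad
\frac{\partial F}{\partial y'} = 2\,y' + 23\,y,
\qquad
\frac{d}{dt}\frac{\partial F}{\partial y'} = 2\,y'' + 23\,y'

\frac{\partial F}{\partial y} - \frac{d}{dt}\frac{\partial F}{\partial y'}
  = 24\,y + 3\,e^{2t} - 2\,y'' = 0
\quad\Longrightarrow\quad
y'' - 12\,y = \tfrac{3}{2}\,e^{2t}
```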
Here is a numerical solution in R. First the functional:
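f<-function(y,t=head(seq(0,2,len=length(y)),-1)){
  len<-length(y)-1
  ## finite-difference derivative on each subinterval (step size is 2/len)
  dy<-diff(y)*len/2
  ## value of y at the midpoint of each subinterval
  y0<-(head(y,-1)+y[-1])/2
  ## sum of the integrand times the step size 2/len
  2*sum(dy^2+23*y0*dy+12*y0^2+3*y0*exp(2*t))/len
}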
Now the function that does the actual optimization. The best results I got were using the BFGS optimization method, and parametrizing using dy rather than y:
findMinY<-function(points=100,          ## number of points of evaluation
                   boundary=c(-1,18),   ## boundary values
                   y0=NULL,             ## optional initial value
                   method="Nelder-Mead", ## optimization method
                   dff=TRUE)            ## if TRUE, optimizes based on dy rather than y
{
  t<-head(seq(0,2,len=points),-1)
  if(is.null(y0) || length(y0)!=points)
    y0<-seq(boundary[1],boundary[2],len=points)
  if(dff)
    y0<-diff(y0)
  else
    y0<-y0[-1]
  y0<-head(y0,-1)
  ff<-function(z){
    if(dff)
      y<-c(cumsum(c(boundary[1],z)),boundary[2])
    else
      y<-c(boundary[1],z,boundary[2])
    f(y,t)
  }
  res<-optim(y0,ff,control=list(maxit=1e9),method=method)
  cat("Iterations:",res$counts,"\n")
  ymin<-res$par
  if(dff)
    c(cumsum(c(boundary[1],ymin)),boundary[2])
  else
    c(boundary[1],ymin,boundary[2])
}
With 500 points of evaluation, it only takes a few seconds with BFGS:
> system.time(yy<-findMinY(500,method="BFGS"))
Iterations: 90 18
user system elapsed
2.696 0.000 2.703
The resulting function looks like this:
plot(seq(0,2,len=length(yy)),yy,type='l')
And now a solution that numerically integrates the Euler equation.
As @HansWerner pointed out, this problem boils down to applying the Euler-Lagrange equation to the integrand in OP's question, and then solving that differential equation, either analytically or numerically. In this case the relevant ODE is
y'' - 12*y = 3/2*exp(2*t)
subject to:
y(0) = -1
y(2) = 18
So this is a boundary value problem, best approached using bvpcol(...) in package bvpSolve.
library(bvpSolve)
F <- function(t, y.in, pars){
  dy  <- y.in[2]
  d2y <- 12*y.in[1] + 1.5*exp(2*t)
  return(list(c(dy,d2y)))
}
init <- c(-1,NA)
end  <- c(18,NA)
t    <- seq(0, 2, by = 0.01)
sol  <- bvpcol(yini = init, yend = end, x = t, func = F)
y <- function(t){ # analytic solution...
  b  <- sqrt(12)
  a  <- 1.5/(4-b*b)
  u  <- exp(2*b)
  C1 <- ((18*u + 1) - a*(exp(4)*u-1))/(u*u - 1)
  C2 <- -1 - a - C1
  return(a*exp(2*t) + C1*exp(b*t) + C2*exp(-b*t))
}
par(mfrow=c(1,2))
plot(t,y(t), type="l", xlim=c(0,2),ylim=c(-1,18), col="red", main="Analytical Solution")
plot(sol[,1],sol[,2], type="l", xlim=c(0,2),ylim=c(-1,18), xlab="t", ylab="y(t)", main="Numerical Solution")
It turns out that in this very simple example, there is an analytical solution:
y(t) = a * exp(2*t) + C1 * exp(sqrt(12)*t) + C2 * exp(-sqrt(12)*t)
where a = -3/16 and C1 and C2 are determined to satisfy the boundary conditions. As the plots show, the numerical and analytic solutions agree completely, and also agree with the solution provided by @mrip.
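As a quick sanity check of the closed form, here is a sketch in Python (standard library only, not R) that re-evaluates the constants with the same formulas as the R function y(t) above. It tests the two boundary conditions and estimates the ODE residual with a central difference. The function name ode_residual and the step size h are my own choices.

```python
import math

B = math.sqrt(12.0)
A = 1.5 / (4.0 - B * B)          # particular-solution coefficient, -3/16
U = math.exp(2.0 * B)
C1 = ((18.0 * U + 1.0) - A * (math.exp(4.0) * U - 1.0)) / (U * U - 1.0)
C2 = -1.0 - A - C1

def y(t):
    """Analytic solution y(t) = a*exp(2t) + C1*exp(b*t) + C2*exp(-b*t)."""
    return A * math.exp(2.0 * t) + C1 * math.exp(B * t) + C2 * math.exp(-B * t)

def ode_residual(t, h=1e-3):
    """y'' - 12*y - 1.5*exp(2t), with y'' approximated by a central difference."""
    d2y = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return d2y - 12.0 * y(t) - 1.5 * math.exp(2.0 * t)

print(y(0.0), y(2.0))       # should reproduce the boundary values -1 and 18
print(ode_residual(1.0))    # should be close to zero, up to discretization error
```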

Solving for the inverse of a function in R

Is there any way for R to solve for the inverse of a given single variable function? The motivation is for me to later tell R to use a vector of values as inputs of the inverse function so that it can spit out the inverse function values.
For instance, for the function y(x) = x^2, the inverse is y = sqrt(x). Is there a way R can solve for the inverse function?
I looked up uniroot(), but I am not solving for the zero of a function.
Any suggestions would be helpful.
Thanks!
What kind of inverse are you looking for? If you want a symbolic inverse (e.g., a function y that is identically equal to sqrt(x)), you're going to have to use a symbolic system. Look at Ryacas, an R library that connects to Yacas, a computer algebra system that can likely compute inverses.
Now, if you need only to compute point-wise inverses, you can define your function in terms of uniroot as you've written:
> inverse = function (f, lower = -100, upper = 100) {
function (y) uniroot((function (x) f(x) - y), lower = lower, upper = upper)[1]
}
> square_inverse = inverse(function (x) x^2, 0.1, 100)
> square_inverse(4)
[1] 1.999976
For a given y and f(x), this will compute x such that f(x) = y, also known as the inverse.
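The same closure idea can be sketched without uniroot, using plain bisection. This is a Python illustration (not R), assuming f is continuous and monotone on [lower, upper]. The names inverse and square_inverse mirror the R code above, and the tolerance is an arbitrary choice.

```python
def inverse(f, lower, upper, tol=1e-10):
    """Return g with g(y) ~= x such that f(x) = y, for f monotone on [lower, upper]."""
    def g(y):
        lo, hi = lower, upper
        f_lo = f(lo) - y
        if f_lo * (f(hi) - y) > 0:
            raise ValueError("f(x) - y does not change sign on [lower, upper]")
        # Halve the bracket until it is narrower than tol.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            f_mid = f(mid) - y
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        return 0.5 * (lo + hi)
    return g

square_inverse = inverse(lambda x: x * x, 0.1, 100.0)
print(square_inverse(4.0))
# Applying it to a vector of inputs is just a loop over the values:
print([square_inverse(v) for v in (1.0, 4.0, 9.0)])
```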
I cannot comment as my reputation is too low.
I am a newbie to R, and it took me a while to understand Mike's code as I was not used to the way functions are defined in his answer.
Below is Mike's code in a longer, but (to me) more readable notation:
inverse <- function(f, lower, upper){
  function(y){
    uniroot(function(x){f(x) - y}, lower = lower, upper = upper, tol=1e-3)[1]
  }
}
square_inverse <- inverse(function(x){x^2}, 0.1, 100)
square_inverse(4)
I hope it helps other newbies as well.

Why is nlogn so hard to invert?

Let's say I have a function that is nlogn in space requirements, and I want to work out the maximum size of input for that function for a given available space, i.e. I want to find n where nlogn=c.
I followed an approach to calculate n, that looks like this in R:
step = function(R, z) { log(log(R)-z)}
guess = function(R) log(log(R))
inverse_nlogn = function(R, accuracy=1e-10) {
  zi_1 = 0
  z = guess(R)
  while(abs(z - zi_1)>accuracy) {
    zi_1 = z
    z = step(R, z)
  }
  exp(exp(z))
}
But I can't understand why it must be solved iteratively. For the range we are interested in (n>1), the function is non-singular.
There's nothing special about n log n — nearly all elementary functions fail to have elementary inverses, and so have to be solved by some other means: bisection, Newton's method, Lagrange inversion theorem, series reversion, Lambert W function...
As Gareth hinted, the Lambert W function (e.g. here) gets you almost there; indeed, n = c/W(c).
A wee google found this, which might be helpful.
Following up (being completely explicit):
library(emdbook)
n <- 2.5
c <- 2.5*log(2.5)
exp(lambertW(c)) ## 2.5
library(gsl)
exp(lambert_W0(c)) ## 2.5
There are probably minor differences in speed, accuracy, etc. of the two implementations. I haven't tested/benchmarked them extensively. Now that I tried
library(sos)
findFn("lambert W")
I discover that it's implemented all over the place: the games package, and a whole package that's called LambertW ...
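If no Lambert W implementation is at hand, the principal branch can be approximated with a few Newton steps on w*exp(w) = c. Below is a minimal sketch in Python (standard library only). The names lambert_w0 and inverse_nlogn and the log1p starting guess are my own choices. It assumes c > 0 and natural logarithms.

```python
import math

def lambert_w0(c, tol=1e-12, max_iter=100):
    """Principal branch W0(c) for c > 0, i.e. the w > 0 solving w * exp(w) = c."""
    if c <= 0:
        raise ValueError("this sketch only handles c > 0")
    w = math.log1p(c)               # starting guess at or above the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (w + 1.0))   # Newton step for w*exp(w) - c
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def inverse_nlogn(c):
    """Solve n * log(n) = c for n > 1 (natural log), using n = exp(W0(c))."""
    return math.exp(lambert_w0(c))

n = 2.5
c = n * math.log(n)
print(inverse_nlogn(c))             # should recover n, up to rounding
```

So the inversion is still iterative under the hood. The iteration is just hidden inside the special-function routine.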
