Integrating over a PCHIP Function - r

How can I integrate over a PCHIP (Piecewise Cubic Hermite Interpolation Polynomial) function in R? pchip {pracma} returns interpolated point data, and to integrate we of course need a function. I see under the help menu for pchip(), "TODO: A `pchipfun' should be provided," I don't know how hard this would be to generate manually? Any other suggestions? You could fit an nth degree polynomial regression to the interpolated points and integrate off that to get a rough approximation, but that gets messy pretty quick...
Here's the source code for pchip {pracma} which returns points and not a function, I suppose returning a function is more of a math question not an R question, but I'm open for any and all suggestions! Please!
function (xi, yi, x)
{
h <- diff(xi)
delta <- diff(yi)/h
d <- .pchipslopes(h, delta)
n <- length(xi)
a <- (3 * delta - 2 * d[1:(n - 1)] - d[2:n])/h
b <- (d[1:(n - 1)] - 2 * delta + d[2:n])/h^2
k <- rep(1, length(x))
for (j in 2:(n - 1)) {
k[xi[j] <= x] <- j
}
s <- x - xi[k]
v <- yi[k] + s * (d[k] + s * (a[k] + s * b[k]))
return(v)
}
Thanks!

What does not work for you? You have to define a function using pchipfun() like this:
> library(pracma)
> xs <- linspace(0, pi, 10)
> ys <- sin(xs)
> pchipfun <- function(xi, yi) function(x) pchip(xi, yi, x)
> f <- pchipfun(xs, ys)
> integrate(f, 0, pi)
2.000749 with absolute error < 0.00017
I have updated pracma 1.7.2 on R-Forge to include pchipfun()
and added some error checking to pchip().

Related

Brownian Motion / loop in R

I want to implete the function of the Wiener representation in R (see https://en.wikipedia.org/wiki/Wiener_process#Wiener_representation). (I want to implement the first formulae) When plotting this
function it should look more similar to the standard brownian motion the higher the dimension of the random vector is, and the lower it should look smoother.
I have tried to implement it, but I think there is a mistake somewhere in the loop, because the graphs do not should look much more like a brownian motion when n is high, I even went as high as 10000 there isn't enough fluctation inside each graph
brownmotion <- function(n, time=1000){
W <- rep(0, time)
Wp1 <- rep(0, time)
Wp2 <- 0
X <- seq(0, 1, length.out = time)
xsi <- rnorm(n)
for ( i in 1:length(X)){
for (j in 1:n){
Wp1[i] <- X[i]*xsi[1]
Wp2 <- Wp2 + xsi[j]*sin(j*X[i]*pi)/(j*pi)
W[i] <- Wp1[i] + sqrt(2)*Wp2
}
}
return (W)
}
Since this is R, this is better done without loops:
brownmotion <- function(n, time=1000){
X <- seq(0, 1, length.out = time)
xsi <- rnorm(n + 1)
W <- xsi[1] * X + sqrt(2) * colSums(xsi[-1] * sin(pi * 1:n %*% t(X)) / (pi * 1:n))
return (W)
}
When coding this, I noticed a small error in your original code in that you use xsi[1] twice. I avoided this by making xsi length n + 1, so xsi[1] could be the initial value and there are still n values left.

Non-comformable arguments in R

I am re-writting an algorithm I did in C++ in R for practice called the Finite Difference Method. I am pretty new with R so I don't know all the rules regarding vector/matrix multiplication. For some reason I am getting a non-conformable arguments error when I do this:
ST_u <- matrix(0,M,1)
ST_l <- matrix(0,M,1)
for(i in 1:M){
Z <- matrix(gaussian_box_muller(i),M,1)
ST_u[i] <- (S0 + delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
ST_l[i] <- (S0 - delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
}
I get this error:
Error in sqrt(T) %*% Z : non-conformable arguments
Here is my whole code:
gaussian_box_muller <- function(n){
theta <- runif(n, 0, 2 * pi)
rsq <- rexp(n, 0.5)
x <- sqrt(rsq) * cos(theta)
return(x)
}
d_j <- function(j, S, K, r, v,T) {
return ((log(S/K) + (r + (-1^(j-1))*0.5*v*v)*T)/(v*(T^0.5)))
}
call_delta <- function(S,K,r,v,T){
return (S * dnorm(d_j(1, S, K, r, v, T))-K*exp(-r*T) * dnorm(d_j(2, S, K, r, v, T)))
}
Finite_Difference <- function(S0,K,r,sigma,T,M,delta_S){
ST_u <- matrix(0,M,1)
ST_l <- matrix(0,M,1)
for(i in 1:M){
Z <- matrix(gaussian_box_muller(i),M,1)
ST_u[i] <- (S0 + delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
ST_l[i] <- (S0 - delta_S)*exp((r - (sigma*sigma)/(2.0))*T + sigma*sqrt(T)%*%Z)
}
Delta <- matrix(0,M,1)
totDelta <- 0
for(i in 1:M){
if(ST_u[i] - K > 0 && ST_l[i] - K > 0){
Delta[i] <- ((ST_u[i] - K) - (ST_l[i] - K))/(2*delta_S)
}else{
Delta <- 0
}
totDelta = totDelta + exp(-r*T)*Delta[i]
}
totDelta <- totDelta * 1/M
Var <- 0
for(i in 1:M){
Var = Var + (Delta[i] - totDelta)^2
}
Var = Var*1/M
cat("The Finite Difference Delta is : ", totDelta)
call_Delta_a <- call_delta(S,K,r,sigma,T)
bias <- abs(call_Delta_a - totDelta)
cat("The bias is: ", bias)
cat("The Variance of the Finite Difference method is: ", Var)
MSE <- bias*bias + Var
cat("The marginal squared error is thus: ", MSE)
}
S0 <- 100.0
delta_S <- 0.001
K <- 100.0
r <- 0.05
sigma <- 0.2
T <- 1.0
M <- 10
result1 <- Finite_Difference(S0,K,r,sigma,T,M,delta_S)
I can't seem to figure out the problem, any suggestions would be greatly appreciated.
In R, the %*% operator is reserved for multiplying two conformable matrices. As one special case, you can also use it to multiply a vector by a matrix (or vice versa), if the vector can be treated as a row or column vector that conforms to the matrix; as a second special case, it can be used to multiply two vectors to calculate their inner product.
However, one thing it cannot do is perform scalar multipliciation. Scalar multiplication of vectors or matrices always uses the plain * operator. Specifically, in the expression sqrt(T) %*% Z, the first term sqrt(T) is a scalar, and the second Z is a matrix. If what you intend to do here is multiply the matrix Z by the scalar sqrt(T), then this should just be written sqrt(T) * Z.
When I made this change, your program still didn't work because of another bug -- S is used but never defined -- but I don't understand your algorithm well enough to attempt a fix.
A few other comments on the program not directly related to your original question:
The first loop in Finite_Difference looks suspicious: guassian_box_muller(i) generates a vector of length i as i varies in the loop from 1 up to M, and forcing these vectors into a column matrix of length M to generate Z is probably not doing what you want. It will "reuse" the values in a cycle to populate the matrix. Try these to see what I mean:
matrix(gaussian_box_muller(1),10,1) # all one value
matrix(gaussian_box_muller(3),10,1) # cycle of three values
You also use loops in many places where R's vector operations would be easier to read and (typically) faster to execute. For example, your definition of Var is equivalent to:
Var <- sum((Delta - totDelta)^2)/M
and the definitions of Delta and totDelta could also be written in this simplified fashion.
I'd suggest Googling for "vector and matrix operations in r" or something similar and reading some tutorials. Vector arithmetic in particular is idiomatic R, and you'll want to learn it early and use it often.
You might find it helpful to consider the rnorm function to generate random Gaussians.
Happy R-ing!

fzero function from MATLAB to R

I want to use R's fzero function to find roots of a function. The problem gets complicated, as the function in question calls some other functions which in turn call another ones. I do have MATLAB code that does it and I am trying to translate it to R, but cannot make in work. My experience with MATLAB is limited, so it's probable I just missed some feature of the MATLAB code while translating. My ultimate goal is to obtain R's working equivalent of the MATLAB code. Any hints will by highly appreciated!
The error I got is in function psi():
Error in (-t(I) * pi^2) %*% time : non-conformable arguments
Although the sizes of matrices do match and this part of code works with some naive input when ran in isolation.
NB: I have tried using mrdivide (R's equivalent of MATLAB's right matrix division) in some places, but with no effect.
NB2: I obtain the same error trying function uniroot instead of fzero.
# Global parameters:
N = 140
A2 = (256 times 256) matrix with data
I = vector of size 256: (0, 1, 2^2, 3^2, 4^2, ..., 255^2)
# ----------------------------------------------------------------
MATLAB working code:
fzero( #(t)(t-evolve(t)),[0,0.1])
function [out,time]=evolve(t)
global N
Sum_func = func([0,2],t) + func([2,0],t) + 2*func([1,1],t);
time=(2*pi*N*Sum_func)^(-1/3);
out=(t-time)/time;
end
function out=func(s,t)
global N
if sum(s)<=4
Sum_func=func([s(1)+1,s(2)],t)+func([s(1),s(2)+1],t); const=
(1+1/2^(sum(s)+1))/3;
time=(-2*const*K(s(1))*K(s(2))/N/Sum_func)^(1/(2+sum(s)));
out=psi(s,time);
else
out=psi(s,t);
end
end
function out=psi(s,Time)
global I A2
% s is a vector
w=exp(-I*pi^2*Time).*[1,.5*ones(1,length(I)-1)];
wx=w.*(I.^s(1));
wy=w.*(I.^s(2));
out=(-1)^sum(s)*(wy*A2*wx')*pi^(2*sum(s));
end
function out=K(s)
out=(-1)^s*prod((1:2:2*s-1))/sqrt(2*pi);
end
# ----------------------------------------------------------------
My attempt at R translation (not working):
fzero(subtract_evolve, c(0, 0.1))
K <- function(s) {
out <- (-1)^s * prod(seq(from = 1,to = 2*s-1, by = 2))/sqrt(2*pi)
return(out)
}
psi <- function(s, time) {
w <- (exp((-t(I) * pi^2) %*% time)) *
t(c(cbind(1, 0.5*ones(1,length(I)-1))))
wx <- t(w * (I^s[1]))
wy <- t(w * (I^s[2]))
out <- (-1)^sum(s) * (wy %*% A2 %*% t(wx)) * pi^(2*sum(s))
return(out)
}
func <- function(s, t) {
if (sum(s) <= 4) {
sum_func <- func(c(s[1]+1,s[2]), t) + func(c(s[1],s[2]+1), t)
const <- (1+1/2^(sum(s)+1))/3
time <- (-2 * const * K(s[1]) * K(s[2]) / N / sum_func)^(1/(2+sum(s)))
out <- psi(s, time)
} else {
out <- psi(s, t)
}
return(out)
}
evolve <- function(t) {
sum_func = func(c(0,2), t) + func(c(2,0), t) + 2*func(c(1,1),t)
time <- (2*pi*N*Sum_func)^(-1/3)
out <- (t-time)/time
return(c(out, time))
}
subtract_evolve <- function(t) {
return(t - evolve(t))
}

efficient way of calculating lots of matrices

I'm trying to write a program that does the following:
Given two intervals A and B, for every (a,b) with a in A and b in B
create a variance matrix ymat, depending on (a,b)
calculate the (multivariate normal) density of some vector y
with mean 0 and variance matrix ymat
I learned that using loops is bad in R, so I wanted to use outer(). Here are my two functions:
y_mat <- function(n,lambda,theta,sigma) {
L <- diag(n);
L[row(L) == col(L) + 1] <- -1;
K <- t(1/n * L - theta*diag(n))%*%(1/n * L - theta*diag(n));
return(sigma^2*diag(n) + 1/lambda*K);
}
make_plot <- function(y,sigma,theta,lambda) {
n <- length(y)
sig_intv <- seq(.1,2*sigma,.01);
th_intv <- seq(-abs(2*theta),abs(2*theta),.01);
z <- outer(sig_intv,th_intv,function(s,t){dmvnorm(y,rep(0,n),y_mat(n,lambda,theta=t,sigma=s))})
contour(sig_intv,th_intv,z);
}
The shape of the variance matrix isn't relevant for this question. n and lambda are just two scalars, as are sigma and theta.
When I try
make_plot(y,.5,-3,10)
I get the following error message:
Error in t(1/n * L - theta * diag(n)) :
dims [product 25] do not match the length of object [109291]
In addition: Warning message:
In theta * diag(n) :
longer object length is not a multiple of shorter object length
Could someone enlighten me as to what's going wrong? Am I maybe going about this the wrong way?
The third argument of outer should be a vectorized function. Wrapping it with Vectorize should suffice:
make_plot <- function(y, sigma, theta, lambda) {
n <- length(y)
sig_intv <- seq(.1,2*sigma,.01);
th_intv <- seq(-abs(2*theta),abs(2*theta),.01);
z <- outer(
sig_intv, th_intv,
Vectorize(function(s,t){dmvnorm(y,rep(0,n),y_mat(n,lambda,theta=t,sigma=s))})
)
contour(sig_intv,th_intv,z);
}

Jensen Shannon divergence in R

I am new to R and was trying to find a function which calculates JS divergence in R.
I can see that R has KLdiv for calculating KL divergence, but is there anything available for JS divergence?
I was looking for a simple implementation of the JS divergence rather than an R library. Since I did not see one in any of the responses, I came up with the one below.
Assuming we have the following input distributions:
# p & q are distributions so their elements should sum up to 1
p <- c(0.00029421, 0.42837957, 0.1371827, 0.00029419, 0.00029419,
0.40526004, 0.02741252, 0.00029422, 0.00029417, 0.00029418)
q <- c(0.00476199, 0.004762, 0.004762, 0.00476202, 0.95714168,
0.00476213, 0.00476212, 0.00476202, 0.00476202, 0.00476202)
The Jensen-Shannon divergence would be:
n <- 0.5 * (p + q)
JS <- 0.5 * (sum(p * log(p / n)) + sum(q * log(q / n)))
> JS
[1] 0.6457538
For more than 2 distributions (which has already been discussed here) we need a function to compute the Entropy:
H <- function(v) {
v <- v[v > 0]
return(sum(-v * log(v)))
}
Then the JS divergence would be:
JSD <- function(w, m) {
return(H(m %*% w) - apply(m, 2, H) %*% w)
}
> JSD(w = c(1/3, 1/3, 1/3), m = cbind(p, q, n))
[,1]
[1,] 0.4305025
Where w is a vector of weights which should sum up to 1 and m is a matrix with the input distributions as columns.
In case anyone is still searching for an answer (I was), there is a function to calculate this in the R package phyloseq: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0061217
I also found this tutorial useful: http://enterotype.embl.de/enterotypes.html
According to wikipedia Jensen-Shannon divergence is a transformation of the KL divergence. Applying the formula from the definition should give you the JS divergence then...
See: http://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence

Resources