Density Estimation on Single Point - r

I am trying to plot densities using non-parametric kernel estimation. For this purpose I'm using asymmetric kernels such as Weibull, Gamma, Inverse Gaussian, etc., and I have successfully reproduced the density plots given in the corresponding articles. My question is about the graphs that show the estimate at a single point of x, especially x=0, as given on page 108 of "Birnbaum-Saunders and Lognormal Kernel Estimators for Modelling Durations in High Frequency Financial Data"; several other papers present the same thing. Ordinary density plots can be produced with the following R code.
Kindly guide me on how to plot the estimate at a specific point.
n <- 200
k <- 400
y <- rexp(n, 1)
h <- 0.79 * IQR(y) * length(y) ^ (-1/5)
x <- seq(min(y) + 0.05, max(y), length=k)
Kbs <- matrix(rep(0, k * n), ncol=k)
fhat <- rep(0, k)
########### BS ###########
for (j in 1:k) {
  for (i in 1:n) {
    Kbs[i, j] <- (1 / (2*sqrt(2*h*pi))) *
      ((sqrt(1 / (x[j]*y[i]))) + (sqrt(x[j] / (y[i]^3)))) *
      exp(- (y[i] / (2*h*x[j])) + (1/h) - (x[j]/(2*h*y[i])))
    Kbs[is.nan(Kbs)] <- 0
  }
  fhat[j] <- 1/n * (sum(Kbs[, j]))
}
d1 <- density(y, bw=h)
plot(x, fhat, type="s", ylab="Density Function", lty=1, xlab="Time")
lines(d1, type="p", col="red")
legend("topright", c("Real Density", "Density by Birnbaum-Saunders Kernel"),
col=c("red", "black"), lty=c(1, 2))
Regards.
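One possible reading of "plotting at a single point" is to evaluate the estimator at one fixed point x0 and look at how that value varies over repeated samples. The sketch below assumes that reading. The helper name bs_kde_at, the evaluation point x0 = 0.05 and the number of replications are illustrative choices, not taken from the cited paper. Note that the kernel as coded above divides by x, so it cannot be evaluated at exactly x = 0; a small positive point is used instead.

```r
# Birnbaum-Saunders kernel density estimate at ONE evaluation point x0.
# Same kernel formula as in the loop above, vectorised over the sample y.
# (bs_kde_at is a hypothetical helper name used only for this sketch.)
bs_kde_at <- function(x0, y, h) {
  stopifnot(length(x0) == 1, x0 > 0, all(y > 0), h > 0)
  kern <- (1 / (2 * sqrt(2 * h * pi))) *
    (sqrt(1 / (x0 * y)) + sqrt(x0 / y^3)) *
    exp(-(y / (2 * h * x0)) + (1 / h) - (x0 / (2 * h * y)))
  kern[is.nan(kern)] <- 0   # same NaN guard as the original code
  mean(kern)
}

set.seed(1)
n  <- 200
x0 <- 0.05                  # assumed evaluation point close to zero

# Estimate from a single sample
y   <- rexp(n, 1)
h   <- 0.79 * IQR(y) * length(y)^(-1/5)
est <- bs_kde_at(x0, y, h)

# Estimates at the same point over repeated simulated samples
reps <- replicate(200, {
  ys <- rexp(n, 1)
  hs <- 0.79 * IQR(ys) * length(ys)^(-1/5)
  bs_kde_at(x0, ys, hs)
})

boxplot(reps, ylab = "Estimate at x0", main = "BS kernel estimate at a single point")
abline(h = dexp(x0, 1), col = "red", lty = 2)  # true Exp(1) density at x0
```

The dashed line is the true density of the simulated Exp(1) data at x0, so the box plot shows the bias and spread of the estimator at that point.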

Related

How to get the correct output when using batch forecasting for the monthly time series with the Mcomp package?

I'm trying to do batch forecasting for the monthly time series in the Mcomp package. I've written some code, but I am not getting any output.
library(forecast)
library(Mcomp)
I use the seq function because I need to select the particular time series whose index ends in 7.
tsset <- (seq(1507, 2797, 10))
tsset
horizon <- 18
fit1<-array(0,130)
for (tsi in 1:130){
  y <- tsset[[tsi]]$x
  yt <- head(y, length(y) - horizon)
  yv <- tail(y, horizon)
  for(i in 1:130){
    fit1 <-c(ets(yt))
  }
  print(fit1)
}
Here is how you get the prediction for the last 18 points, given the first 112 points in the time series (you don't need loops):
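library(forecast) # provides ets() and forecast()
tsset <- seq(1507, 2797, 10) + 10*runif(130) # add noise
horizon <- 18
y <- tsset
n <- length(y)
yt <- head(y, n - horizon) # training part: first 112 points
yv <- tail(y, horizon) # held-out last 18 points, plotted below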
fit1 <- ets(yt)
yv1 <- forecast(fit1, h=horizon)
start <- n - horizon + 1
plot(start:n, yv, type='l', col='red', lty=1, xlab='t', ylab='y(t)')
lines(start:n, yv1$mean, col='blue', lty=2)
lines(start:n, yv1$upper[,2], col='green', lty=2)
lines(start:n, yv1$lower[,2], col='green', lty=2)
legend("topleft", legend=c("original", "forecast", "95% CI"),
col=c("red", "blue", "green"), lty=c(1,2,2), cex=0.8)

Plot the likelihood of weibull

I would like to plot the likelihood function of a Weibull sample of size 1000 over a sequence of values of the shape parameter theta. I have used the standardised Weibull, so the scale lambda is 1. However, the output is a horizontal straight line.
n<-1000
lik <- function(theta, x){
  K<- length(theta)
  n<- length(x)
  out<- rep(0,K)
  for(k in 1:K){
    out[k] <- prod(dweibull(x, shape= theta[k], scale=1))
  }
  return(out)
}
theta<-seq(0.01, 10, by = 0.01)
x <- rweibull(n, shape= 0.5, scale= 1)
plot(theta, lik(theta, x), type="l", lwd=2)
There is nothing really wrong with what you have done, but computers struggle to calculate the product of many small numbers, and the result can underflow to zero (even 0.99^1000 is only about 4e-5). It is therefore easier to take logs and sum. (As the log transform is a monotonically increasing function, maximising the log-likelihood is the same as maximising the likelihood.) Thus change
prod(dweibull(x, shape= theta[k], scale=1))
to
sum(dweibull(x, shape= theta[k], scale=1, log=TRUE))
The other minor change is to plot the log-likelihood within a reasonable range of theta so that you can see the curve.
Working code:
set.seed(1)
n<-1000
lik <- function(theta, x){
  K <- length(theta)
  n <- length(x)
  out <- rep(0,K)
  for(k in 1:K){
    out[k] <- sum(dweibull(x, shape= theta[k], scale=1, log=TRUE))
  }
  return(out)
}
popTheta = 0.5
theta = seq(0.01, 1.5, by = 0.01)
x = rweibull(n, shape=popTheta, scale= 1)
plot(theta, lik(theta, x), type="l", lwd=2)
abline(v=popTheta)
theta[which.max( lik(theta, x))]
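To see the underflow directly, the following sketch compares the raw product with the sum of logs on the same kind of sample, and then locates the maximum with optimize() instead of a grid search. The seed and the helper name loglik are illustrative choices; only base R functions are used.

```r
set.seed(1)
x <- rweibull(1000, shape = 0.5, scale = 1)

# The raw likelihood is far smaller than the smallest positive double,
# so the product underflows to 0 even at the true shape.
lik_raw <- prod(dweibull(x, shape = 0.5, scale = 1))

# The log-likelihood is an ordinary finite (large negative) number.
loglik <- function(theta) {
  sum(dweibull(x, shape = theta, scale = 1, log = TRUE))
}
ll_true <- loglik(0.5)

# Evaluate over a grid with sapply() instead of an explicit for loop.
theta <- seq(0.01, 1.5, by = 0.01)
ll <- sapply(theta, loglik)

# One-dimensional numerical maximisation over the same interval.
mle <- optimize(loglik, interval = c(0.01, 1.5), maximum = TRUE)$maximum

print(c(lik_raw = lik_raw, ll_true = ll_true,
        grid_max = theta[which.max(ll)], mle = mle))
```

The grid maximum and the optimize() result should agree to within the grid spacing, and both should lie close to the true shape of 0.5.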

Conway Maxwell Distribution Density Plot

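I have written my own code to simulate a sample from the Conway-Maxwell distribution.
This is the pmf (Guikema & Coffelt, 2008), as implemented in dcomp below: P(X = x) = (lamb^x / x!)^v / z, where z is the sum of (lamb^j / j!)^v over j = 0, 1, 2, ...
However, I have run into a problem when plotting the density.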
rcomp <- function(n,lamb,v)
{
  u <- runif(n)
  w <- integer(n)
  for(i in 1:n) {
    z=sum(sapply( 0:100, function(j) (( ((lamb)^j) / (factorial(j)) )^v) ))
    x <- seq(1, 50, 1) #seq of 1 to 50, increase by 1
    px <- (((lamb^x)/factorial(x))^v)/z
    # px is pmf of re-parameter conway maxwell
    w[i] <- if (u[i] < px[1]) 0 else (max (which (cumsum(px) <= u[i])))
  }
  return (w)
}
dcomp <- function(x,lamb,v) {
  z=sum(sapply( 0:100, function(j) (( ((lamb)^j) / (factorial(j)) )^v) ))
  px <- (((lamb^x)/factorial(x))^v)/z
  return(px)
}
I want to plot the density to check whether lamb or v is the location parameter, but the plot I get looks weird.
x = rcomp(100,6,0.2); pdf = dcomp(x,6,0.2)
x1 = rcomp(100,6,0.5); pdf1 = dcomp(x1,6,0.5)
x2 = rcomp(100,6,0.7); pdf2 = dcomp(x2,6,0.7)
plot(x2, pdf2, type="l", lwd=1,lty=1,col="blue")
How could I solve this problem?
Source: Guikema & Coffelt (2008), A Flexible Count Data Regression Model for Risk Analysis. Risk Analysis 28(1): 215.
You have to sort the values of the x coordinate if you want the graph to connect the points in their axis order.
Note, however, that there might be better ways to graph the density you want. See the red curve. I first create a vector x of values within a certain range and then compute the PDF for those values. These pairs (x, y) are what the lines function plots.
set.seed(2673) # Make the results reproducible
x2 <- rcomp(100, 6, 0.7)
x2 <- sort(x2)
pdf2 <- dcomp(x2, 6, 0.7)
plot(x2, pdf2, type = "l", lwd = 1, lty = 1, col = "blue")
x <- seq(0, 50, length.out = 100)
y <- dcomp(x, 6, 0.2)
lines(x, y, type = "l", col = "red")
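Because the distribution is discrete, another option is to evaluate the pmf on the integer support only and draw it with vertical bars. The sketch below does this for the three values of v from the question. The function name dcomp_log is an illustrative choice; it computes the same formula as dcomp but on the log scale, and keeps the truncation of the normalising sum at 100 terms from the original code.

```r
# Conway-Maxwell pmf in the parameterisation used above:
# P(X = x) = (lamb^x / x!)^v / z, computed on the log scale.
# jmax truncates the normalising sum (100, as in the original dcomp).
dcomp_log <- function(x, lamb, v, jmax = 100) {
  j <- 0:jmax
  log_w <- v * (j * log(lamb) - lgamma(j + 1))
  m <- max(log_w)
  log_z <- m + log(sum(exp(log_w - m)))   # log-sum-exp for stability
  exp(v * (x * log(lamb) - lgamma(x + 1)) - log_z)
}

support <- 0:40                 # integer support shown in the plot
p02 <- dcomp_log(support, 6, 0.2)
p05 <- dcomp_log(support, 6, 0.5)
p07 <- dcomp_log(support, 6, 0.7)

plot(support, p07, type = "h", col = "blue",
     xlab = "x", ylab = "P(X = x)")
points(support, p05, pch = 16, col = "darkgreen")
points(support, p02, pch = 17, col = "red")
legend("topright", legend = c("v = 0.7", "v = 0.5", "v = 0.2"),
       col = c("blue", "darkgreen", "red"),
       lty = c(1, NA, NA), pch = c(NA, 16, 17))
```

The ratio of successive pmf terms is (lamb/x)^v, so the pmf rises while x is below lamb and falls afterwards whatever the value of v. Changing v changes how spread out the distribution is around that point, which is what the three overlaid curves are meant to show.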

R: Contour plot for each component of a fitted bivariate mixture

(Please note: I have been using R for only two days.)
I have a dataset data that looks like this:
plot(data, pch=20, xlim=c(-2,3), ylim=c(-1,2))
I'm using the mixsmsn package to fit a mixture of bivariate skew-normal distributions:
sn2 <- smsn.mmix(data, nu=3, g=2, get.init=TRUE, criteria=TRUE, group=TRUE, family="Skew.normal", error=1e-08, iter.max=10000)
I can plot it like this (why doesn't pch=20 work?):
mix.contour(data, sn2, pch=20, xlim=c(-2,3), ylim=c(-1,2), levels=c(0.1,0.25,0.5))
How can I achieve the following?
I want to draw a contour separately for each component at half its height. That is, say the mixture distribution has the form p f_1(x,y) + (1-p) f_2(x,y), with f_i being the pdf of the i-th skew-normal component. I want to draw, on a scatter plot, a contour of the f_1 component at half its height and a second contour of f_2 at half its height. I'd like the result to look like this:
Using the fMultivar package, I came up with this:
library(mixsmsn) # smsn.mmix()
library(fMultivar) # used for grid2d() and dmsn() below
X <- data
sn2 <- smsn.mmix(X, nu=3, g=2, get.init=TRUE, criteria=TRUE, group=TRUE, family="Skew.normal", error=1e-08, iter.max=10000)
mu1 <- sn2$mu[[1]]
sigma1 <- sn2$Sigma[[1]]
alpha1 <- c(sn2$shape[[1]][1], sn2$shape[[1]][2])
p1 <- sn2$pii[[1]]
mu2 <- sn2$mu[[2]]
sigma2 <- sn2$Sigma[[2]]
alpha2 <- c(sn2$shape[[2]][1], sn2$shape[[2]][2])
p2 <- sn2$pii[[2]]
N <- 101
x <- seq(min(X[, 1]), max(X[, 1]), l=N)
y <- seq(min(X[, 2]), max(X[, 2]), l=N)
u <- grid2d(x, y)$x
v <- grid2d(x, y)$y
XY <- cbind(u, v)
Z1 <- matrix(p1*dmsn(XY, mu1, sigma1, alpha1), ncol=N)
Z2 <- matrix(p2*dmsn(XY, mu2, sigma2, alpha2), ncol=N)
c1 <- 0.5*max(Z1)
c2 <- 0.5*max(Z2)
plot(X, pch=20, xlim=c(-2,3), ylim=c(-1,2))
contour(x, y, Z1, add=TRUE, col="red", lwd=3, levels=c(c1), labels="")
contour(x, y, Z2, add=TRUE, col="green", lwd=3, levels=c(c2), labels="")

Getting values from kernel density estimation in R

I am trying to get density estimates for the log of stock prices in R. I know I can plot it using plot(density(x)). However, I actually want values for the function.
I'm trying to implement the kernel density estimation formula. Here's what I have so far:
a <- read.csv("boi_new.csv", header=FALSE)
S = a[,3] # takes column of increments in stock prices
dS=S[!is.na(S)] # omits first empty field
N = length(dS) # Sample size
rseed = 0 # Random seed
x = rep(c(1:5),N/5) # Inputted data
set.seed(rseed) # Sets random seed for reproducibility
QL <- function(dS){
  h = density(dS)$bandwidth
  r = log(dS^2)
  f = 0*x
  for(i in 1:N){
    f[i] = 1/(N*h) * sum(dnorm((x-r[i])/h))
  }
  return(f)
}
QL(dS)
Any help would be much appreciated. Been at this for days!
You can pull the values directly from the density function:
x = rnorm(100)
d = density(x, from=-5, to = 5, n = 1000)
d$x
d$y
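If values are needed at points that are not on the grid returned by density(), one common approach is to interpolate between the grid points. The sketch below uses approxfun() from base R for this; the seed, the name f_hat and the query points are illustrative.

```r
set.seed(42)
x <- rnorm(100)
d <- density(x, from = -5, to = 5, n = 1000)

# approxfun() turns the (d$x, d$y) grid into a function
# by linear interpolation between neighbouring grid points.
f_hat <- approxfun(d$x, d$y)

# Density estimate at arbitrary points inside [from, to]
vals <- f_hat(c(-1, 0, 1.5))
print(vals)

# Sanity check: a Riemann sum over the grid should be close to 1
area <- sum(d$y) * (d$x[2] - d$x[1])
print(area)
```

Points outside the range given by from and to return NA, so the grid should cover every point you intend to query.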
Alternatively, if you really want to write your own kernel density function, here's some code to get you started:
Set the points z and x range:
z = c(-2, -1, 2)
x = seq(-5, 5, 0.01)
Now we'll add the points to a graph
plot(0, 0, xlim=c(-5, 5), ylim=c(-0.02, 0.8),
pch=NA, ylab="", xlab="z")
for(i in 1:length(z)) {
points(z[i], 0, pch="X", col=2)
}
abline(h=0)
Put normal densities around each point:
## Now we combine the kernels,
x_total = numeric(length(x))
for(i in 1:length(x_total)) {
  for(j in 1:length(z)) {
    x_total[i] = x_total[i] +
      dnorm(x[i], z[j], sd=1)
  }
}
and add the curves to the plot:
lines(x, x_total, col=4, lty=2)
Finally, calculate the complete estimate:
## Just as a histogram is the sum of the boxes,
## the kernel density estimate is just the sum of the bumps.
## All that's left to do, is ensure that the estimate has the
## correct area, i.e. in this case we divide by $n=3$:
plot(x, x_total/3,
xlim=c(-5, 5), ylim=c(-0.02, 0.8),
ylab="", xlab="z", type="l")
abline(h=0)
This corresponds to
density(z, adjust=1, bw=1)
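The same sum of bumps can be wrapped in a small function that returns the estimate at any requested points, which is what the question asks for. This is a minimal sketch with a Gaussian kernel and a bandwidth supplied by the caller; the name kde_at is an illustrative choice. It is checked against density(), which uses a binned approximation, so the two agree only approximately.

```r
# Gaussian kernel density estimate evaluated at the points in x0:
# f_hat(p) = 1 / (n * h) * sum_i dnorm((p - data_i) / h)
kde_at <- function(x0, data, h) {
  sapply(x0, function(p) mean(dnorm((p - data) / h)) / h)
}

z <- c(-2, -1, 2)
query <- c(-2, 0, 2)

own <- kde_at(query, z, h = 1)

# Reference values from density() with the same bandwidth,
# interpolated at the query points
d <- density(z, bw = 1, from = -5, to = 5, n = 1001)
ref <- approx(d$x, d$y, xout = query)$y

print(rbind(own, ref))
```

For log stock-price data, data would be the vector of observations and h could be taken from density(data)$bw so that both approaches use the same bandwidth.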
The plots above give:
