I am trying to plot densities using a non-parametric kernel estimation method. For this purpose I'm using asymmetrical kernels such as Weibull, Gamma, Inverse Gaussian, etc. I have successfully plotted the densities as given in the respective articles. My query is about those graphs that are plotted at a single point of x, especially x=0, as given on page 108 of "Birnbaum-Saunders and Lognormal Kernel Estimators for Modelling Durations in High Frequency Financial Data"; some other authors present the same thing. Simple density plots can be produced with the following R code.
Kindly guide me on how to plot at a specific point.
n <- 200
k <- 400
y <- rexp(n, 1)
h <- 0.79 * IQR(y) * length(y) ^ (-1/5)
x <- seq(min(y) + 0.05, max(y), length=k)
Kbs <- matrix(rep(0, k * n), ncol=k)
fhat <- rep(0, k)
########### BS ###########
for (j in 1:k) {
for (i in 1:n) {
Kbs[i, j] <- (1 / (2*sqrt(2*h*pi))) *
((sqrt(1 / (x[j]*y[i]))) + (sqrt(x[j] / (y[i]^3)))) *
exp(- (y[i] / (2*h*x[j])) + (1/h) - (x[j]/(2*h*y[i])))
Kbs[is.nan(Kbs)] <- 0
}
fhat[j] <- 1/n * (sum(Kbs[, j]))
}
d1 <- density(y, bw=h)
plot(x, fhat, type="s", ylab="Density Function", lty=1, xlab="Time")
lines(d1, type="p", col="red")
legend("topright", c("Real Density", "Density by Birnbaum-Saunders Kernel"),
col=c("red", "black"), lty=c(1, 2))
Regards.
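One possible reading of "plotting at a single point" is to hold the target point fixed and look at the kernel as a function of the data value. The sketch below wraps the expression from the loop above in two helper functions; the names bs_kernel and fhat_at, and the choice x0 = 0.05, are illustrative assumptions and are not taken from the cited paper. The point x = 0 itself cannot be used directly because the expression divides by x, so a small positive value stands in for it.

```r
# BS kernel weights for data values y at a fixed target point x0,
# using the same expression as in the double loop above
bs_kernel <- function(y, x0, h) {
  (1 / (2 * sqrt(2 * h * pi))) *
    (sqrt(1 / (x0 * y)) + sqrt(x0 / y^3)) *
    exp(-y / (2 * h * x0) + 1 / h - x0 / (2 * h * y))
}

# Density estimate at one point: the average kernel weight over the sample
fhat_at <- function(x0, y, h) {
  k <- bs_kernel(y, x0, h)
  k[is.nan(k)] <- 0   # same NaN guard as the original loop
  mean(k)
}

set.seed(1)
y <- rexp(200, 1)
h <- 0.79 * IQR(y) * length(y)^(-1/5)

# Estimate at a single point close to zero
est <- fhat_at(0.05, y, h)

# Shape of the kernel at that point: vary y, keep x0 fixed
ygrid <- seq(0.01, 3, length.out = 400)
kvals <- bs_kernel(ygrid, x0 = 0.05, h = h)
plot(ygrid, kvals, type = "l", xlab = "y", ylab = "Kernel weight")
```

Calling bs_kernel with several values of x0 and overlaying the curves with lines() would show how the kernel shape changes as the target point moves away from the boundary.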
I'm trying to do batch forecasting for the monthly time series with the Mcomp package. I've written some code, but I am not getting any output.
library(forecast)
library(Mcomp)
I use the seq function because I need to select the time series whose index ends with 7.
tsset <- (seq(1507, 2797, 10))
tsset
horizon <- 18
fit1<-array(0,130)
for (tsi in 1:130){
y <- tsset[[tsi]]$x
yt <- head(y, length(y) - horizon)
yv <- tail(y, horizon)
for(i in 1:130){
fit1 <-c(ets(yt))
}
print(fit1)
}
Here is how you get the prediction for the last 18 points, given the first 112 points in the time series (you don't need loops):
tsset<- seq(1507, 2797, 10) + 10*runif(130) # add noise
horizon <- 18
y <- tsset
n <- length(y)
yt <- head(y, n - horizon)
yv <- tail(y, horizon) # held-out actual values, plotted below as "original"
fit1 <- ets(yt)
yv1 <- forecast(fit1, h=horizon)
start <- n - horizon + 1
plot(start:n, yv, type='l', col='red', lty=1, xlab='t', ylab='y(t)')
lines(start:n, yv1$mean, col='blue', lty=2)
lines(start:n, yv1$upper[,2], col='green', lty=2)
lines(start:n, yv1$lower[,2], col='green', lty=2)
legend("topleft", legend=c("original", "forecast", "95% CI"),
col=c("red", "blue", "green"), lty=c(1,2,2), cex=0.8)
I would like to plot the likelihood function of a Weibull sample of size 1000 over a sequence of values of the shape parameter theta. I have used the standardised Weibull, so the scale lambda is 1. However, the output is a horizontal straight line.
n<-1000
lik <- function(theta, x){
K<- length(theta)
n<- length(x)
out<- rep(0,K)
for(k in 1:K){
out[k] <- prod(dweibull(x, shape= theta[k], scale=1))
}
return(out)
}
theta<-seq(0.01, 10, by = 0.01)
x <- rweibull(n, shape= 0.5, scale= 1)
plot(theta, lik(theta, x), type="l", lwd=2)
There is nothing really wrong with what you have done, but the product of many small numbers underflows in floating-point arithmetic and ends up as zero (even 0.99^1000 is only about 4e-5). It is therefore easier to log-transform and then sum. (As the log transform is a monotonically increasing function, maximising the log-likelihood is the same as maximising the likelihood.) Thus change
prod(dweibull(x, shape= theta[k], scale=1))
to
sum(dweibull(x, shape= theta[k], scale=1, log=TRUE))
The other minor change is to plot the log-likelihood within a reasonable range of theta so that you can see the curve.
Working code:
set.seed(1)
n<-1000
lik <- function(theta, x){
K <- length(theta)
n <- length(x)
out <- rep(0,K)
for(k in 1:K){
out[k] <- sum(dweibull(x, shape= theta[k], scale=1, log=TRUE))
}
return(out)
}
popTheta = 0.5
theta = seq(0.01, 1.5, by = 0.01)
x = rweibull(n, shape=popTheta, scale= 1)
plot(theta, lik(theta, x), type="l", lwd=2)
abline(v=popTheta)
theta[which.max( lik(theta, x))]
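To see the underflow directly, the following minimal sketch compares the raw product with the sum of logs at the true shape parameter. The variable names lik_true and loglik_true are illustrative only.

```r
set.seed(1)
x <- rweibull(1000, shape = 0.5, scale = 1)

# Likelihood at the true shape: a product of 1000 densities
lik_true <- prod(dweibull(x, shape = 0.5, scale = 1))

# Log-likelihood at the same shape: a sum of 1000 log-densities
loglik_true <- sum(dweibull(x, shape = 0.5, scale = 1, log = TRUE))

lik_true == 0           # TRUE when the product has underflowed to zero
is.finite(loglik_true)  # the log-likelihood stays an ordinary finite number
.Machine$double.xmin    # smallest positive normalised double, for scale
```

If the product is already zero at the true parameter, it is zero everywhere else on the grid too, which is why the original plot was a flat line.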
I have written my own code to simulate a sample from the Conway-Maxwell distribution.
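This is the pmf (Guikema & Goffelt, 2008), written as it is implemented in the code below: P(X = x) = ((lamb^x / x!)^v) / Z, where Z is the sum of (lamb^j / j!)^v over j = 0, 1, 2, ...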
However, I have run into a problem when plotting the density.
rcomp <- function(n,lamb,v)
{
u <- runif(n)
w <- integer(n)
for(i in 1:n) {
z=sum(sapply( 0:100, function(j) (( ((lamb)^j) / (factorial(j)) )^v) ))
x <- seq(0, 50, 1) # support 0 to 50; starting at 0 makes px[1] equal to P(X = 0), as the next lines assume
px <- (((lamb^x)/factorial(x))^v)/z
# px is pmf of re-parameter conway maxwell
w[i] <- if (u[i] < px[1]) 0 else (max (which (cumsum(px) <= u[i])))
}
return (w)
}
dcomp <- function(x,lamb,v) {
z=sum(sapply( 0:100, function(j) (( ((lamb)^j) / (factorial(j)) )^v) ))
px <- (((lamb^x)/factorial(x))^v)/z
return(px)
}
I want to plot the density to check whether lamb or v is the location parameter, but the plot I get looks weird.
x = rcomp(100,6,0.2); pdf = dcomp(x,6,0.2)
x1 = rcomp(100,6,0.5); pdf1 = dcomp(x1,6,0.5)
x2 = rcomp(100,6,0.7); pdf2 = dcomp(x2,6,0.7)
plot(x2, pdf2, type="l", lwd=1,lty=1,col="blue")
How could I solve this problem?
Source: Guikema & Goffelt (2008), A Flexible Count Data Regression Model for Risk Analysis. Risk Analysis 28(1): 215.
You have to sort the values of the x coordinate if you want a graph to connect the points in their axis order.
Note, however, that there might be better ways to graph the density you want. See the red curve. I first create a vector x of values within a certain range and then compute the PDF for those values. These pairs (x, y) are what function lines plots.
set.seed(2673) # Make the results reproducible
x2 <- rcomp(100, 6, 0.7)
x2 <- sort(x2)
pdf2 <- dcomp(x2, 6, 0.7)
plot(x2, pdf2, type = "l", lwd = 1, lty = 1, col = "blue")
x <- seq(0, 50, length.out = 100)
y <- dcomp(x, 6, 0.2)
lines(x, y, type = "l", col = "red")
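Because the distribution is discrete, another option is to evaluate the pmf only on integer support and draw it as vertical spikes. This is a sketch, assuming the same dcomp definition as in the question (rewritten here in vectorised form so the block is self-contained); the cut-off at 30 is an arbitrary choice for display.

```r
# pmf as defined in the question, with the normalising sum truncated at 100
dcomp <- function(x, lamb, v) {
  z <- sum(((lamb^(0:100)) / factorial(0:100))^v)
  ((lamb^x) / factorial(x))^v / z
}

k <- 0:30                          # integer support shown in the plot
p <- dcomp(k, lamb = 6, v = 0.7)

# Spikes with a dot on top make it clear that mass sits only on integers
plot(k, p, type = "h", lwd = 2, col = "blue", xlab = "x", ylab = "P(X = x)")
points(k, p, pch = 16, col = "blue")

# Sanity check: the probabilities over 0:100 should sum to 1
total <- sum(dcomp(0:100, lamb = 6, v = 0.7))
```

Repeating the plot for several values of lamb or v, with the same axis limits, makes it easier to judge which parameter shifts the location.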
(Please note: I'm using R for only two days now.)
I have a dataset data that looks like this:
plot(data, pch=20, xlim=c(-2,3), ylim=c(-1,2))
I'm using the mixsmsn package to fit a mixture of bivariate skew-normal distributions:
sn2 <- smsn.mmix(data, nu=3, g=2, get.init=TRUE, criteria=TRUE, group=TRUE, family="Skew.normal", error=1e-08, iter.max=10000)
I can plot it like this (why doesn't pch=20 work?):
mix.contour(data, sn2, pch=20, xlim=c(-2,3), ylim=c(-1,2), levels=c(0.1,0.25,0.5))
How can I achieve the following?
I want to draw a contour separately for each component at half its height. That is, say it's a mixture distribution of the form p f_1(x,y) + (1-p) f_2(x,y) (f_i being the pdf of the i-th skew-normal component); I want to draw (on a scatter plot) a contour of the f_1 component at half its height, and a second contour for f_2 at half its height. I'd like the result to look like this:
Using the fMultivar package, I came up with this:
library(mixsmsn)   # smsn.mmix
library(fMultivar) # grid2d
library(sn)        # dmsn
X <- data
sn2 <- smsn.mmix(X, nu=3, g=2, get.init=TRUE, criteria=TRUE, group=TRUE, family="Skew.normal", error=1e-08, iter.max=10000)
mu1 <- sn2$mu[[1]]
sigma1 <- sn2$Sigma[[1]]
alpha1 <- c(sn2$shape[[1]][1], sn2$shape[[1]][2])
p1 <- sn2$pii[[1]]
mu2 <- sn2$mu[[2]]
sigma2 <- sn2$Sigma[[2]]
alpha2 <- c(sn2$shape[[2]][1], sn2$shape[[2]][2])
p2 <- sn2$pii[[2]]
N <- 101
x <- seq(min(X[, 1]), max(X[, 1]), l=N)
y <- seq(min(X[, 2]), max(X[, 2]), l=N)
u <- grid2d(x, y)$x
v <- grid2d(x, y)$y
XY <- cbind(u, v)
Z1 <- matrix(p1*dmsn(XY, mu1, sigma1, alpha1), ncol=N)
Z2 <- matrix(p2*dmsn(XY, mu2, sigma2, alpha2), ncol=N)
c1 <- 0.5*max(Z1)
c2 <- 0.5*max(Z2)
plot(X, pch=20, xlim=c(-2,3), ylim=c(-1,2))
contour(x, y, Z1, add=TRUE, col="red", lwd=3, levels=c(c1), labels="")
contour(x, y, Z2, add=TRUE, col="green", lwd=3, levels=c(c2), labels="")
I am trying to get density estimates for the log of stock prices in R. I know I can plot it using plot(density(x)). However, I actually want values for the function.
I'm trying to implement the kernel density estimation formula. Here's what I have so far:
a <- read.csv("boi_new.csv", header=FALSE)
S = a[,3] # takes column of increments in stock prices
dS=S[!is.na(S)] # omits first empty field
N = length(dS) # Sample size
rseed = 0 # Random seed
x = rep(c(1:5),N/5) # Inputted data
set.seed(rseed) # Sets random seed for reproducibility
QL <- function(dS){
h = density(dS)$bandwidth
r = log(dS^2)
f = 0*x
for(i in 1:N){
f[i] = 1/(N*h) * sum(dnorm((x-r[i])/h))
}
return(f)
}
QL(dS)
Any help would be much appreciated. Been at this for days!
You can pull the values directly from the density function:
x = rnorm(100)
d = density(x, from=-5, to = 5, n = 1000)
d$x
d$y
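density() returns the estimate on a fixed grid. If values at arbitrary points are needed, one option is to interpolate between grid points with approxfun(); this is a sketch, and the name f is just an illustrative choice.

```r
set.seed(42)
x <- rnorm(100)
d <- density(x, from = -5, to = 5, n = 1000)

# Linear interpolation between the grid points returned by density()
f <- approxfun(d$x, d$y)

vals <- f(c(-1, 0, 0.37))  # estimated density at three arbitrary points
outside <- f(6)            # NA: 6 lies outside the grid range [-5, 5]
```

The interpolation error shrinks as n grows, so a finer grid in density() gives values closer to the exact kernel estimate.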
Alternatively, if you really want to write your own kernel density function, here's some code to get you started:
Set the points z and the range of x:
z = c(-2, -1, 2)
x = seq(-5, 5, 0.01)
Now we'll add the points to a graph
plot(0, 0, xlim=c(-5, 5), ylim=c(-0.02, 0.8),
pch=NA, ylab="", xlab="z")
for(i in 1:length(z)) {
points(z[i], 0, pch="X", col=2)
}
abline(h=0)
Put normal densities around each point and combine them:
## Now we combine the kernels,
x_total = numeric(length(x))
for(i in 1:length(x_total)) {
for(j in 1:length(z)) {
x_total[i] = x_total[i] +
dnorm(x[i], z[j], sd=1)
}
}
and add the curves to the plot:
lines(x, x_total, col=4, lty=2)
Finally, calculate the complete estimate:
## Just as a histogram is the sum of the boxes,
## the kernel density estimate is just the sum of the bumps.
## All that's left to do, is ensure that the estimate has the
## correct area, i.e. in this case we divide by n = 3:
plot(x, x_total/3,
xlim=c(-5, 5), ylim=c(-0.02, 0.8),
ylab="", xlab="z", type="l")
abline(h=0)
This corresponds to
density(z, adjust=1, bw=1)
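The double loop above can also be written without explicit loops. The following is a minimal sketch of a Gaussian kernel density estimator that returns values at any requested points; the function name kde and its arguments are illustrative assumptions.

```r
# Gaussian kernel density estimate at points x, from data z with bandwidth h
kde <- function(x, z, h = 1) {
  # outer() builds a length(x) by length(z) matrix of scaled differences
  u <- outer(x, z, "-") / h
  # Average the kernel over the data points and rescale by the bandwidth
  rowMeans(dnorm(u)) / h
}

z <- c(-2, -1, 2)
x <- seq(-5, 5, 0.01)
fhat <- kde(x, z, h = 1)

# Compare with density() evaluated on the same grid
d <- density(z, bw = 1, from = -5, to = 5, n = length(x))
max_diff <- max(abs(fhat - d$y))
max_diff
```

density() uses a binned approximation, so a small difference from the exact sum is expected rather than exact agreement.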