Bayesian simple linear regression Gibbs Sampling with gamma prior - r

Please help me out.
I am using Metropolis-Hastings within Gibbs to generate a Markov chain whose stationary distribution is the joint posterior distribution of (beta, phi) given the observed y, where the model for y is simple linear regression and phi is 1/sigma^2. The full conditional distribution for phi is gamma(shape=shape_0+n/2, rate=rate_0 + 0.5*sum((y$y-b[1]-b[2]*y$x)^2)), where shape_0 and rate_0 are the shape and rate of the gamma prior on phi.
Here is my code:
y <- read.table("...",header = T)
n <- 50
shape_0 <- 10
rate_0 <- 25
shape <- shape_0+n/2
mcmc <- function (n = 10){
  X <- matrix(0,n,3)
  b <- c(5,2)
  phi <- 0.2
  X[1,] <- c(b,phi)
  count1 <- 0
  count2 <- 0
  for (i in 2:n){
    phi_new <- rnorm(1,phi,1) # generate new phi candidate
    rate <- rate_0 + 0.5*sum((y$y-b[1]-b[2]*y$x)^2) # intercept b[1], slope b[2]
    prob1 <- min(dgamma(phi_new,shape = shape, rate = rate)/
                 dgamma(phi,shape = shape, rate = rate),1)
    ## here is where I run into trouble: dgamma(phi_new,shape = shape, rate = rate)
    ## and dgamma(phi,shape = shape, rate = rate) both give 0
    u <- runif(1)
    if (prob1>u) {
      X[i,3] <- phi_new; count1=count1+1
    } else {
      X[i,3] <- phi
    }
    phi <- X[i,3]
    ....}
I know I should use a log transformation on the precision parameter, but I'm not sure exactly how to do it: log(dgamma(phi_new,shape = shape, rate = rate)) returns -Inf.
Thank you so much for your help.
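One way to stay on the log scale is to ask dgamma() for the log density directly (log = TRUE) and compare log(u) with the difference of log densities, rather than taking log() of a density that has already underflowed to 0. The sketch below illustrates only the phi update. The data frame y is simulated here as a stand-in for the file read from "...", and the proposal standard deviation of 0.1 and the name n_iter are assumptions made for the example, not values from the question.

```r
set.seed(42)

# Simulated stand-in for the data read from "..." (assumption)
n <- 50
x <- runif(n, 0, 10)
y <- data.frame(x = x, y = 5 + 2 * x + rnorm(n, sd = 2))

shape_0 <- 10
rate_0  <- 25
shape   <- shape_0 + n / 2
b       <- c(5, 2)  # held fixed here; the full sampler would also update b
rate    <- rate_0 + 0.5 * sum((y$y - b[1] - b[2] * y$x)^2)

n_iter   <- 2000    # assumed chain length
phi      <- 0.2
draws    <- numeric(n_iter)
accepted <- 0

for (i in seq_len(n_iter)) {
  phi_new <- rnorm(1, phi, 0.1)  # symmetric random-walk proposal
  # A non-positive candidate has zero target density, so it is always rejected
  if (phi_new > 0) {
    log_ratio <- dgamma(phi_new, shape = shape, rate = rate, log = TRUE) -
                 dgamma(phi,     shape = shape, rate = rate, log = TRUE)
    if (log(runif(1)) < log_ratio) {
      phi <- phi_new
      accepted <- accepted + 1
    }
  }
  draws[i] <- phi
}

mean(draws)        # compare with the gamma mean shape / rate
accepted / n_iter  # acceptance rate
```

Because this full conditional is itself a gamma distribution, rgamma(1, shape = shape, rate = rate) would draw phi directly, with no accept/reject step.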

Related

Estimating Bias in R

Write a simulation experiment to estimate the bias of the estimator λ̂ = 1/X̄ by sampling
using x=rexp(n,rate=5) and recording the values of 1/mean(x). You should find that the
bias is λ/n−1. Here we’ve used λ = 5 but the result will hold for any λ.
Here is my solution (I don't get λ/n−1). Am I doing something wrong here?
set.seed(1)
lambda <- 5
x <- rexp(n= 1e5, rate = lambda )
samp.mean <- mean(x)
lam.est <- 1/samp.mean
lam.est ##4.986549
bias <- abs(lambda - lam.est)
bias ##0.01345146
To start with, there is a mistake in your formula. The bias of the lambda estimator is lambda/(n-1), not lambda/n-1.
Then note that, to carry out this experiment correctly, it is not enough to compute the estimator only once.
Repeat the experiment n times, each time on a fresh sample of size nx.
lambda = 3
nx = 150
n = 1e5
set.seed(1)
out = vector("numeric", n)
for(i in 1:n){
out[i] = 1/mean(rexp(n= nx, rate = lambda))
}
lambda/(nx-1)
mean(out)
bias = abs((mean(out)-lambda))
As you can see, for lambda = 3 and nx = 150 the expression lambda/(nx-1) is 0.02013423, and the mean of the estimates is 3.019485, an estimated bias of 0.019485.
lambda = 5
nx = 200
n = 1e5
set.seed(1)
out = vector("numeric", n)
for(i in 1:n){
out[i] = 1/mean(rexp(n= nx, rate = lambda))
}
lambda/(nx-1)
mean(out)
bias = abs((mean(out)-lambda))
Similarly, for lambda = 5 and nx = 200 the expression lambda/(nx-1) is 0.02512563, and the mean of the estimates is 5.024315, an estimated bias of 0.024315.
Perform this experiment for other values of lambda and nx and you will find that the bias of this estimator is lambda/(nx-1), where nx is the sample size (the n in the formula).
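To try other settings quickly, the loop can be wrapped in a small function. This is only a sketch: estimate_bias and its reps argument are hypothetical names introduced here, and lambda = 5 with nx = 20 is an arbitrary choice that makes the bias large enough to see clearly.

```r
# Hypothetical helper: Monte Carlo estimate of the bias of 1/mean(x)
# for exponential samples of size nx with rate lambda
estimate_bias <- function(lambda, nx, reps = 1e4) {
  est <- replicate(reps, 1 / mean(rexp(nx, rate = lambda)))
  mean(est) - lambda
}

set.seed(1)
lambda  <- 5
nx      <- 20
mc_bias <- estimate_bias(lambda, nx)
theory  <- lambda / (nx - 1)

# The two numbers should agree up to Monte Carlo error
c(monte_carlo = mc_bias, theory = theory)
```

Keeping the sign (rather than taking abs()) also shows that the estimator overshoots lambda on average.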

Running rJAGS when the likelihood is a custom density

I am trying to figure out how to sample from a custom density in rjags but am running into issues. Having searched the site, I saw that there is a zeros (or ones) trick, based on BUGS code, that can be employed, but I am having a hard time implementing it in rjags. I think I am doing it correctly but keep getting the following error:
Error in jags.model(model1.spec, data = list(x = x, N = N), n.chains = 4, :
Error in node dpois(lambda)
Length mismatch in Node::setValue
Here is my rjags code for reproducibility:
library(rjags)
set.seed(4)
N = 100
x = rexp(N, 3)
L = quantile(x, prob = 1) # Censoring point
censor = ifelse(x <= L, 1, 0) # Censoring indicator
x[censor == 1] <- L
model1.string <-"
model {
for (i in 1:N){
x[i] ~ dpois(lambda)
lambda <- -N*log(1-exp(-(1/mu)))
}
mu ~ dlnorm(mup, taup)
mup <- log(.0001)
taup <- 1/49
R <- 1 - exp(-(1/mu) * .0001)
}
"
model1.spec<-textConnection(model1.string)
jags <- jags.model(model1.spec,
data = list('x' = x,
'N' = N),
n.chains=4,
n.adapt=100)
Here, my negative log likelihood of the density I am interested in is -N*log(1-exp(-(1/mu))). Is there an obvious mistake in the code?
Using the zeros trick, the variable on the left-hand side of the dpois() relationship has to be an N-length vector of zeros. The variable x should show up in the likelihood somewhere. Here is an example using the normal distribution.
set.seed(519)
N <- 100
x <- rnorm(100, mean=3)
z <- rep(0, N)
C <- 10
pi <- pi
model1.string <-"
model {
for (i in 1:N){
lambda[i] <- pow(2*pi*sig2, -0.5) * exp(-.5*pow(x[i]-mu, 2)/sig2)
loglam[i] <- log(lambda[i]) + C
z[i] ~ dpois(loglam[i])
}
mu ~ dnorm(0,.1)
tau ~ dgamma(1,.1)
sig2 <- pow(tau, -1)
sumLL <- sum(log(lambda[]))
}
"
model1.spec<-textConnection(model1.string)
set.seed(519)
jags <- jags.model(model1.spec,
data = list('x' = x,
'z' = z,
'N' = N,
'C' = C,
'pi' = pi),
inits = function()list(tau = 1, mu = 3),
n.chains=4,
n.adapt=100)
samps1 <- coda.samples(jags, c("mu", "sig2"), n.iter=1000)
summary(samps1)
Iterations = 101:1100
Thinning interval = 1
Number of chains = 4
Sample size per chain = 1000
1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

      Mean     SD Naive SE Time-series SE
mu   4.493 2.1566 0.034100         0.1821
sig2 1.490 0.5635 0.008909         0.1144

2. Quantiles for each variable:

       2.5%   25%   50%   75% 97.5%
mu   0.6709 3.541 5.218 5.993 7.197
sig2 0.7909 0.999 1.357 1.850 2.779
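A quick numerical check of the identity behind the zeros trick may help when adapting the model above. In the usual statement of the trick, the Poisson mean for each zero is the negative log-likelihood plus a constant, phi[i] = -log(L[i]) + C, so that dpois(0, phi[i]) equals L[i] * exp(-C). Note the minus sign: the model string above passes log(lambda[i]) + C instead, which is worth double-checking given that the posterior mean of mu (4.493) is far from the simulated mean of 3. The sketch uses base R only, and the values of x, mu, sig2 and C are made up for illustration.

```r
# Made-up values for illustration (assumptions, not from the answer)
x    <- c(2.1, 3.4, 2.9)
mu   <- 3
sig2 <- 1
C    <- 10  # constant that keeps every Poisson mean positive

# Per-observation likelihood under the normal model
lik <- dnorm(x, mean = mu, sd = sqrt(sig2))

# Zeros trick: the Poisson mean is the NEGATIVE log-likelihood plus C
phi <- -log(lik) + C

# Contribution of an observed zero with mean phi[i]
zero_contrib <- dpois(0, phi)

# Proportional to the likelihood, with the same factor exp(-C) for every i
all.equal(zero_contrib, lik * exp(-C))
```

In a JAGS model string, the corresponding lines would be phi[i] <- -log(lambda[i]) + C followed by z[i] ~ dpois(phi[i]).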

How to draw Poisson density curve in R?

I need to show that the number of events in a Poisson process follows a Poisson distribution with parameter lambda * t.
Here is the Poisson process generator:
ppGen <- function(lambda, maxTime){
taos <- taosGen(lambda, maxTime)
pp <- NULL
for(i in 1:maxTime){
pp[i] <- sum(taos <= i)
}
return(pp)
}
Here I try to replicate the process 1000 times and collect the total number of occurrences in each realisation into a vector:
d <- ppGen(0.5,100)
tail(d,n=1)
reps <- 1000
x1 <- replicate(reps, tail(ppGen(0.5,100), n=1))
hist(x1)
Here is the histogram:
Here I am trying to draw a theoretical Poisson density curve with parameter lambda * t:
xfit<-seq(1,100,length=100)
yfit<-dpois(xfit,lambda = 0.5*100)
lines(xfit,yfit)
But the curve doesn't appear anywhere near the histogram. Can anyone suggest the right way to do this?
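The histogram is drawn on the count scale, while dpois() returns probabilities, so the curve ends up flattened along the x-axis. Drawing the histogram on the density scale (freq = FALSE) puts both on the same scale. You can try curve(), as below: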
x <- rpois(1000, 0.5 * 100)
dp <- function(x, lbd = 0.5 * 100) dpois(x, lambda = lbd)
curve(dp, 0, 100)
hist(x, freq = FALSE, add = TRUE)
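The same idea can be applied to the simulated process itself, drawing the histogram first and overlaying the Poisson probabilities at integer values. Since taosGen() is not shown in the question, the sketch below uses a hypothetical stand-in, count_events(), which counts exponential inter-arrival times falling before max_time. The variable names and the unit-width integer breaks are choices made for this example.

```r
set.seed(1)
lambda   <- 0.5
max_time <- 100
reps     <- 1000

# Hypothetical stand-in for tail(ppGen(lambda, maxTime), n = 1):
# number of arrivals of a rate-lambda Poisson process up to max_time
count_events <- function(lambda, max_time) {
  # Draw far more inter-arrival times than are plausibly needed
  n_draw   <- ceiling(3 * lambda * max_time) + 50
  arrivals <- cumsum(rexp(n_draw, rate = lambda))
  sum(arrivals <= max_time)
}

x1 <- replicate(reps, count_events(lambda, max_time))

# Density-scale histogram with one bar per integer count
hist(x1, freq = FALSE,
     breaks = seq(min(x1) - 0.5, max(x1) + 0.5, by = 1))

# Theoretical Poisson(lambda * max_time) probabilities at the integers
k <- min(x1):max(x1)
lines(k, dpois(k, lambda = lambda * max_time), type = "b")
```

With unit-width bars, each bar height is the observed proportion of that count, so it is directly comparable to dpois().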

How to simulate data in R, such that p-value of regressor is exactly 0.05?

I have written a small function that simulates data with normally distributed errors, as is usual in linear models. My question is how to get a model in which the p-value of sim[, 1] is exactly 0.05. I want to show that if I add a random variable, even one that is normally distributed around zero with a small variance, N(0, 0.0023), the p-value of sim[, 1] changes. The code below shows the true model.
set.seed(37) # seed for reproducibility
simulation <- function(b_0, b_1,n,min_x_1 ,max_x_1,sd_e){
mat <- NA
x_1 <- runif(n = n, min = min_x_1, max =max_x_1)
error <- rnorm(mean = 0,sd = sd_e, n = n )
y <- b_0 + b_1*x_1 + error
mat <- matrix(cbind(x_1,y), ncol = 2)
return(mat)
#plot(mat[,1],mat[,2])
}
sim <- simulation(10,-2,10000,-10,70,0.003)
summary(lm(sim[,2] ~ sim[,1] ))

R - Fitting a constrained AutoRegression time series

I have a time series to which I need to fit an AR (autoregressive) model.
The AR model has the form:
x(t) = a0 + a1*x(t-1) + a2*x(t-2) + ... + aq*x(t-q) + noise.
I have two constraints:
Find the best AR fit when lag.max = 50.
Sum of all coefficients a0 + a1 + ... + aq = 1
I wrote the below code:
require(FitAR)
data(lynx) # my real data comes from the stock market.
z <- -log(lynx)
#find best model
step <- SelectModel(z, ARModel = "AR" ,lag.max = 50, Criterion = "AIC",Best=10)
summary(step) # display results
# fit the model and get coefficients
arfit <- ar(z,p=1, order.max=ceiling(mean(step[,1])), aic=FALSE)
#check if the sum of coefficients is 1
sum(arfit$ar)
[1] 0.5784978
My question is, how to add the constraint: sum of all coefficients = 1?
I looked at this question, but I do not understand how to apply it.
**UPDATE**
I think I managed to solve my problem as follows.
library(quadprog)
coeff <- arfit$ar
y <- 0
for (i in seq_along(coeff)) {
  # series shifted by i positions, zero-padded at the end
  shifted <- c(z[(i+1):length(z)], rep(0,i))
  y <- y + coeff[i]*shifted
  X <- if (i == 1) shifted else cbind(X, shifted)
}
Dmat <- t(X) %*% X
s <- solve.QP(Dmat , t(y) %*% X, matrix(1, nrow=length(coeff), ncol=1), 1, meq=1 ) # one constraint entry per coefficient
s$solution
# The coefficients should sum up to 1
sum(s$solution)
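For comparison, the equality constraint can also be imposed in closed form with base R, without quadprog. The sketch below is an illustration under several assumptions: it regresses x(t) on its first p lags built with embed(), leaves out the intercept a0, uses p = 3 as an arbitrary order instead of the AIC choice, and corrects the ordinary least-squares solution so that the coefficients sum to 1. The names E, b_ols and b_con are introduced here.

```r
z <- -log(as.numeric(datasets::lynx))
p <- 3  # illustrative order (assumption), not the AIC-selected one

# embed(): column 1 is x(t), columns 2..(p+1) are lags 1..p
E <- embed(z, p + 1)
y <- E[, 1]
X <- E[, -1, drop = FALSE]

# Unconstrained least squares (no intercept)
XtX_inv <- solve(crossprod(X))
b_ols   <- XtX_inv %*% crossprod(X, y)

# Constraint t(A) %*% b = 1, i.e. the coefficients sum to 1
A <- matrix(1, nrow = p, ncol = 1)

# Equality-constrained least squares via the Lagrange-multiplier correction:
# b_con = b_ols - (X'X)^-1 A (A'(X'X)^-1 A)^-1 (A' b_ols - 1)
adj   <- XtX_inv %*% A %*% solve(t(A) %*% XtX_inv %*% A, t(A) %*% b_ols - 1)
b_con <- b_ols - adj

sum(b_con)  # equals 1 up to floating-point error
```

Forcing the AR coefficients to sum to 1 imposes a unit root on the fitted model, so the result is not a stationary AR fit.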
