In JAGS, how does a stochastic node work? - r

...and what does the ~ sign mean, compared to R, in y[i] ~ dnorm(m[i], tau) vs y[i] <- dnorm(n, m[i], tau)?
Consider the two lines of code:
for(i in 1:length(y)) {
  y[i] ~ dnorm(m[i], tau)            # ---> JAGS code (stochastic node)
  m[i] = alpha + beta*(x[i] - x_bar)
  # ...
}
y[i] <- dnorm(n, m[i], tau)          # ---> R
In JAGS, what will the n values be, since they are not specified inside the dnorm function (dnorm(m[i], tau))?
For each i, does dnorm calculate the density of each y value with respect to the mean m[i], which has a linear relationship determined by the deterministic node, and tau (the precision)?
In short, I want to know what n values will be used by dnorm or any other density function (dgamma, dbeta, etc.).

In this specific instance y is your response variable, m is your linear predictor, and tau is the precision (the inverse of the variance). Using ~ makes the relationship stochastic. Looking at the JAGS user manual...
"Relations can be of two types. A stochastic relation (~) defines a stochastic node, representing a random variable in the model. A deterministic relation (<-) defines a deterministic
node, the value of which is determined exactly by the values of its parents. The equals sign
(=) can be used for a deterministic relation in place of the left arrow (<-)."
So, in other words, you are assuming that the values in y are drawn from a normal distribution whose mean is m and whose precision is tau.
While dnorm in R calculates the density, JAGS calculates the log density (as per the user manual). Effectively, this stochastic relationship allows you to use y and x to estimate alpha, beta, and tau, and you use dnorm in this case by making a distributional assumption about the data generating process.
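To answer the question about n directly: in JAGS you never pass the data value to dnorm, because the node on the left of ~ supplies it. For each i, JAGS evaluates the (log) density of the observed y[i] at mean m[i] and precision tau. A rough R analogue of the contribution of a single observation (a sketch only; note that R's dnorm is parameterised by the standard deviation, not the precision) is:
dnorm(y[i], mean = m[i], sd = 1/sqrt(tau), log = TRUE)
The same applies to dgamma, dbeta, and the other distributions: the left-hand side of ~ is the value whose density gets evaluated.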
Of course, as this is a Bayesian analysis, you'll need priors for your parameters. You can also deterministically calculate the standard deviation from the precision. A full model would look something like...
model{
  # likelihood
  for(i in 1:length(y)) {
    y[i] ~ dnorm(m[i], tau)
    m[i] <- alpha + beta*(x[i] - x_bar)
  }
  # priors
  tau ~ dgamma(0.001, 0.001)
  sd <- 1/sqrt(tau)
  alpha ~ dnorm(0, 0.001)
  beta ~ dnorm(0, 0.001)
}
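To actually fit it from R, you would save the model (e.g. in a character string or a .txt file) and pass the data to the sampler. A minimal sketch with rjags, assuming the model above is stored in a string called model_string (an illustrative name) and that x and y are numeric vectors in your workspace:
library(rjags)
jags_data <- list(y = y, x = x, x_bar = mean(x))
jm <- jags.model(textConnection(model_string), data = jags_data, n.chains = 3)
update(jm, 1000)                                       # burn-in
post <- coda.samples(jm, c("alpha", "beta", "sd"), n.iter = 5000)
summary(post)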

Related

Variance components of tensor interactions in R::mgcv

Why does mgcv::gam.vcomp show two variance components for interactions made with mgcv::ti?
I can't seem to find an explanation, or even a between-the-lines explanation, anywhere. Is it perhaps the variance attributed to each component in the interaction?
require(mgcv)
test1 <- function(x, z, sx = 0.3, sz = 0.4) {
  x <- x*20
  (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+
               0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2))
}
n <- 500
old.par <- par(mfrow=c(2,2))
x <- runif(n)/20; z <- runif(n)
xs <- seq(0,1,length=30)/20; zs <- seq(0,1,length=30)
pr <- data.frame(x=rep(xs,30), z=rep(zs,rep(30,30)))
truth <- matrix(test1(pr$x,pr$z),30,30)
f <- test1(x,z)
y <- f + rnorm(n)*0.2
b3 <- gam(y ~ ti(x) + ti(z) + ti(x,z))
b3s <- gam(y ~ ti(x) + ti(z) + s(x,z)) # describing the interaction with s()
I know we're sort of mixing apples and oranges here.
gam.vcomp(b3)
ti(x) ti(z) ti(x,z)1 ti(x,z)2
0.06609731 0.01476070 0.08834218 0.05700322
gam.vcomp(b3s)
ti(x) ti(z) s(x,z)
0.1623056 2.4870344 7.7484987
You'll see the same behaviour with te(x, z)
> b <- gam(y ~ te(x,z))
> gam.vcomp(b)
te(x,z)1 te(x,z)2
0.08668107 0.04596708
and arises because tensor product smooths are defined by (in this case two) marginal bases, each of which has its own smoothness parameter. Hence there are two variance components, one per smoothness parameter/marginal basis.
ti(x,z)1 is the variance component for the marginal basis of x,
ti(x,z)2 is the variance component for the marginal basis of z.
As these tensor product interaction smooths have had the main effects removed from them, physical interpretation is complicated; in a practical sense, though, the values are the variance-component interpretation of the smoothness parameters of the marginal bases.
The reason s(x, z) has just one variance component is that it uses a 2-d thin plate spline basis. This basis is isotropic; the same amount of smoothness is assumed in the two dimensions, so a single smoothness parameter suffices for the basis. Hence there is a single variance component.
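You can see this directly from the estimated smoothing parameters of the fits above; a quick sketch using the b3 and b3s objects from the question:
b3$sp   # four smoothing parameters: ti(x), ti(z), and one per marginal basis of ti(x,z)
b3s$sp  # three smoothing parameters: the isotropic s(x,z) contributes only one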

Exponentiation of vector in JAGS model, Bayesian analysis in R

I have the following JAGS model for use in a Bayesian analysis in R. I am trying to estimate the posterior distribution of my variable "R". All variables except R are supposed to be deterministic nodes. The variables s_A, z_A, z_W, and d are vectors, while tau_s is a data.frame. TTD_aquifer and O2s_all are therefore expected to be a vector for each i.
model {
  for (i in 1:N){
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- sum(O2s_all)/2
    tau_s_bar[i] = (s_A[i]*z_A[i])/R[i]*log(z_A[i]/(z_A[i]-z_W[i]))
    TTD_aquifer <- t((d[i]*sqrt(tau_s_bar[i]))/sqrt(4*3.14*d[i]*t(tau_s[,i]^3))*exp(-1*((d[i]*tau_s_bar[i])/(4*t(tau_s[,i])))*
      (1-t(tau_s[,i])/tau_s_bar[i])^2))
    O2s_all <- t(O2_o[i]-k_o[i]*t(tau_s[,i]))*TTD_aquifer
    # prior on R
    R[i] ~ dlnorm(-2, 1/(0.6)^2)
  }
  # prior on tau and sigma
  tau <- pow(sigma, -2)
  sigma ~ dunif(0, 100)
}
When I run this in jags.model() I get the following error: RUNTIME ERROR:
Invalid vector argument to exp. So it looks like I cannot input a vector into exp() like you can in R. The equations for TTD_aquifer and O2s_all run fine in R for a deterministic example. How should I write my equation for TTD_aquifer in JAGS to avoid the exp issue?
In JAGS, inverse link functions like exp only take scalar arguments. You could change your model to the following in order to use exp. Note that you will need to include an object in your data list that gives the number of rows in the tau_s data frame. Since I do not know what your model is doing, I have not checked whether your parentheses are in the correct places across all of the divisions and multiplications.
model {
  for (i in 1:N){
    y[i] ~ dnorm(mu[i], tau)
    mu[i] <- sum(O2s_all[,i])/2
    tau_s_bar[i] <- (s_A[i]*z_A[i])/R[i]*log(z_A[i]/(z_A[i]-z_W[i]))
    for(j in 1:K){ # K = nrow of tau_s
      TTD_aquifer[j,i] <- t((d[i]*sqrt(tau_s_bar[i]))/
        sqrt(4*3.14*d[i]*t(tau_s[j,i]^3))*
        exp(-1*((d[i]*tau_s_bar[i])/(4*t(tau_s[j,i])))*
        (1-t(tau_s[j,i])/tau_s_bar[i])^2))
      O2s_all[j,i] <- t(O2_o[i]-k_o[i]*t(tau_s[j,i]))*TTD_aquifer[j,i]
    } # close K loop
    # prior on R
    R[i] ~ dlnorm(-2, 1/(0.6)^2)
  }
  # prior on tau and sigma
  tau <- pow(sigma, -2)
  sigma ~ dunif(0, 100)
}
As TTD_aquifer and O2s_all should be a vector for each i, they will each be a two-dimensional matrix of the same size as tau_s at each step of an MCMC chain. If you have a big dataset (i.e., big N and K) and are running this model for many iterations, tracking those derived parameters will take up considerable memory. Just something to keep in mind if you are running this on a computer without sufficient RAM; thinning the chain is one way to reduce the cost of tracking them.
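As noted above, K (the number of rows of tau_s) has to be supplied with the data. A sketch of what the data list and model call might look like (model_string is an illustrative name for the model above; tau_s is converted to a matrix because JAGS expects numeric arrays, not data frames):
library(rjags)
jags_data <- list(y = y, s_A = s_A, z_A = z_A, z_W = z_W, d = d,
                  O2_o = O2_o, k_o = k_o,
                  tau_s = as.matrix(tau_s),
                  N = length(y), K = nrow(tau_s))
jm <- jags.model(textConnection(model_string), data = jags_data, n.chains = 3)
post <- coda.samples(jm, "R", n.iter = 5000)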

Specify a discrete weibull distribution in JAGS or BUGS for R

I am fitting a Weibull model to discrete values using JAGS in R. I have no problem fitting a Weibull to continuous data, but I run into trouble when I switch to discrete values.
Here is some data, and code to fit a weibull model in JAGS:
# draw data from a weibull distribution
y <- rweibull(200, shape = 1, scale = 0.9)
#y <- round(y)
# load jags, specify a jags model.
library(runjags)
j.model ="
model{
  for (i in 1:N){
    y[i] ~ dweib(shape[i], scale[i])
    shape[i] <- b1
    scale[i] <- b2
  }
  # priors
  b1 ~ dnorm(0, .0001) I(0, )
  b2 ~ dnorm(0, .0001) I(0, )
}
"
# load data as list
data <- list(y = y, N = length(y))
# run jags model.
jags.out <- run.jags(j.model,
                     data = data,
                     n.chains = 3,
                     monitor = c('b1','b2'))
summary(jags.out)
This model fits fine. However, if I transform the y values to discrete values using y <- round(y) and run the same model, it fails with the error Error in node y[7], Node inconsistent with parents. The particular number of the node changes every time I try, but it's always a low number.
I know I can make this run by adding a very small number to all of my values; however, this does not account for the fact that the data are discrete. I know discrete Weibull distributions exist, but how can I implement one in JAGS?
You can use the 'ones trick' to implement a discrete Weibull distribution in JAGS. Using its pmf we can make a function to generate some data:
pmf_weib <- function(x, scale, shape){
  exp(-(x/scale)^shape) - exp(-((x+1)/scale)^shape)
}
# probability of getting 0 through 200 with scale = 7 and shape = 4
probs <- pmf_weib(seq(0,200), 7, 4)
y <- sample(0:200, 100, TRUE, probs) # sample from those probabilities
For the 'ones trick' to work you generally have to divide your new pmf by some large constant to ensure that the probability passed to dbern is between 0 and 1. While the pmf of the discrete Weibull already ensures this, we have still included a large constant in the model anyway. So, here is what the model looks like now:
j.model ="
data{
C <- 10000
for(i in 1:N){
ones[i] <- 1
}
}
model{
for (i in 1:N){
discrete_weib[i] <- exp(-(y[i]/scale)^shape) - exp(-((y[i]+1)/scale)^shape)
ones[i] ~ dbern(discrete_weib[i]/C)
}
#priors
scale ~ dnorm(0, .0001) I(0, )
shape ~ dnorm(0, .0001) I(0, )
}
"
Note that we have added 1) a vector of ones and a large constant C in the data block, 2) the pmf of the discrete Weibull, and 3) a Bernoulli trial on that scaled probability.
You can fit the model with the same code you have above (monitoring scale and shape instead of b1 and b2). Here is the summary, which shows that the model successfully recovered the parameter values (scale = 7 and shape = 4).
Lower95 Median Upper95 Mean SD Mode MCerr MC%ofSD SSeff
scale 6.968277 7.289216 7.629413 7.290810 0.1695400 NA 0.001364831 0.8 15431
shape 3.843055 4.599420 5.357713 4.611583 0.3842862 NA 0.003124576 0.8 15126
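For completeness, a sketch of that fitting call (the same run.jags structure as in the question, with the monitored parameters renamed):
data <- list(y = y, N = length(y))
jags.out <- run.jags(j.model,
                     data = data,
                     n.chains = 3,
                     monitor = c('scale', 'shape'))
summary(jags.out)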

Linear fit without slope in R

I want to fit a linear model with no slope and extract information from it. My objective is to find the best y-intercept for a horizontal line through a data set, and also to use the derived linear fit to identify whether y shows a particular behaviour (x is a date). I've been using the range to evaluate the behaviour, but I'm looking for a unitless index.
Removing y-intercept:
X <- 1:10
Y <- 2:11
lm1 <- lm(Y~X + 0, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 1
lm1 <- lm(Y~X - 1, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 2
lm1 <- lm(Y~0 + X, data = data.frame(X=X,Y=Y)) # y-intercept remove opt 3
lm1$coefficients
X
1.142857
summary(lm1)$r.squared
[1] 0.9957567
All of the lm fits shown before have a high r.squared (0.9957567). But if I evaluate:
lm2 <- lm(Y~1, data = data.frame(X=X,Y=Y))
lm2$coefficients
(Intercept)
6.5
summary(lm2)$r.squared
[1] 0
Is there a way to calculate this outside of the lm function, or to calculate an index that identifies how well y is represented by a horizontal line?
Let lmObject be your linear model returned by lm (called with y = TRUE to return y).
If your model has an intercept, then R-squared is computed as
with(lmObject, 1 - c(crossprod(residuals) / crossprod(y - mean(y))) )
If your model does not have an intercept, then R-squared is computed as
with(lmObject, 1 - c(crossprod(residuals) / crossprod(y)) )
Note that if your model is only an intercept (so it certainly falls into the first case above), you have
residuals = y - mean(y)
and thus R-squared is always 1 - 1 = 0.
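A quick way to check both formulas against summary()$r.squared, reusing the X and Y vectors from the question (a sketch; remember to call lm with y = TRUE so the fit stores y):
fit_no_int <- lm(Y ~ 0 + X, data = data.frame(X = X, Y = Y), y = TRUE)
with(fit_no_int, 1 - c(crossprod(residuals) / crossprod(y)))          # matches summary(fit_no_int)$r.squared
fit_int <- lm(Y ~ X, data = data.frame(X = X, Y = Y), y = TRUE)
with(fit_int, 1 - c(crossprod(residuals) / crossprod(y - mean(y))))   # matches summary(fit_int)$r.squared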
In regression analysis, it is always recommended to include an intercept in the model to get unbiased estimates. A model with only an intercept is the NULL model; any other model is compared with this NULL model for further analysis of variance.
A note. The value / quantity you want has nothing to do with regression. You can simply compute it as
c(crossprod(Y - mean(Y)) / crossprod(Y)) ## `Y` is your data
#[1] 0.1633663
Alternatively, use
(length(Y) - 1) * var(Y) / c(crossprod(Y))
#[1] 0.1633663

Calculating R^2 for a nonlinear least squares fit

Suppose I have x values, y values, and expected y values f (from some nonlinear best fit curve).
How can I compute R^2 in R? Note that this function is not a linear model, but a nonlinear least squares (nls) fit, so not an lm fit.
You just use the lm function to fit a linear model:
x = runif(100)
y = runif(100)
spam = summary(lm(x~y))
> spam$r.squared
[1] 0.0008532386
Note that R-squared is not defined for non-linear models, or is at least very tricky; quoting from R-help:
There is a good reason that an nls model fit in R does not provide r-squared - r-squared doesn't make sense for a general nls model.
One way of thinking of r-squared is as a comparison of the residual sum of squares for the fitted model to the residual sum of squares for a trivial model that consists of a constant only. You cannot guarantee that this is a comparison of nested models when dealing with an nls model. If the models aren't nested this comparison is not terribly meaningful.
So the answer is that you probably don't want to do this in the first place.
If you want peer-reviewed evidence, see this article for example; it's not that you can't compute the R^2 value, it's just that it may not mean the same thing/have the same desirable properties as in the linear-model case.
Sounds like f are your predicted values. So take the squared distances from them to the actual values, divided by n times the variance of y;
so something like
1 - sum((y-f)^2)/(length(y)*var(y))
should give you a quasi R-squared value, so long as your model is reasonably close to a linear model and n is pretty big.
As a direct answer to the question asked (rather than arguing that R2 / pseudo R2 values aren't useful), the nagelkerke function in the rcompanion package will report various pseudo R2 values for nonlinear least squares (nls) models, as proposed by McFadden, Cox and Snell, and Nagelkerke, e.g.
require(rcompanion)  # nagelkerke() and the BrendonSmall data come from rcompanion; nls() itself is in base R
data(BrendonSmall)
quadplat = function(x, a, b, clx) {
  ifelse(x < clx, a + b * x + (-0.5*b/clx) * x * x,
                  a + b * clx + (-0.5*b/clx) * clx * clx)}
model = nls(Sodium ~ quadplat(Calories, a, b, clx),
            data = BrendonSmall,
            start = list(a = 519,
                         b = 0.359,
                         clx = 2304))
nullfunct = function(x, m){m}
null.model = nls(Sodium ~ nullfunct(Calories, m),
                 data = BrendonSmall,
                 start = list(m = 1346))
nagelkerke(model, null=null.model)
The soilphysics package also reports Efron's pseudo R2 and adjusted pseudo R2 values for nls models, computed as 1 - RSS/TSS:
pred <- predict(model)
n <- length(pred)
res <- resid(model)
w <- weights(model)
if (is.null(w)) w <- rep(1, n)
rss <- sum(w * res ^ 2)
resp <- pred + res
center <- weighted.mean(resp, w)
r.df <- summary(model)$df[2]
int.df <- 1
tss <- sum(w * (resp - center)^2)
r.sq <- 1 - rss/tss
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
out <- list(pseudo.R.squared = r.sq,
            adj.R.squared = adj.r.sq)
which is also the pseudo R2 as calculated by the accuracy function in the rcompanion package. Basically, this R2 measures how much better your fit is compared to simply drawing a flat horizontal line through the data. This can make sense for nls models if your null model allows for an intercept-only model, and also for particular other nonlinear models. E.g. for a scam model that uses strictly increasing splines (bs="mpi" in the spline term), the fitted model in the worst possible scenario (e.g. where your data were strictly decreasing) would be a flat line, and hence would result in an R2 of zero. The adjusted R2 then also penalizes models with a higher number of fitted parameters. Using the adjusted R2 value would already address a lot of the criticisms of the paper linked above, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2892436/ (besides, if one swears by using information criteria to do model selection, the question becomes which one to use: AIC, BIC, EBIC, AICc, QIC, etc).
Just using
r.sq <- max(cor(y,yfitted),0)^2
adj.r.sq <- 1 - (1 - r.sq) * (n - int.df) / r.df
I think would also make sense if you have normal Gaussian errors - i.e. the correlation between the observed and fitted y (clipped at zero, so that a negative relationship would imply zero predictive power) squared, and then adjusted for the number of fitted parameters in the adjusted version. If y and yfitted go in the same direction this would be the R2 and adjusted R2 value as reported for a regular linear model. To me this would make perfect sense at least, so I don't agree with outright rejecting the usefulness of pseudo R2 values for nls models, as the answer above seems to imply.
For non-normal error structures (e.g. if you were using a GAM with non-normal errors) the McFadden pseudo R2 is defined analogously as
1 - residual deviance / null deviance
See here and here for some useful discussion.
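For a fitted object that stores its null deviance (e.g. a glm, or a gam from mgcv), this is a one-liner; a sketch assuming a fitted object called fit:
1 - deviance(fit) / fit$null.deviance   # McFadden-style pseudo R2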
Another quasi-R-squared for non-linear models is to square the correlation between the actual y-values and the predicted y-values. For linear models this is the regular R-squared.
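In R that is just (a sketch, with fit a fitted nls object and y the observed response):
cor(y, predict(fit))^2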
As an alternative, I have used the following procedure several times:
compute a fit on the data with the nls function;
using the resulting model, make predictions;
plot the data against the values predicted by the model (if the model is good, the points should lie near the 1:1 line);
compute the R2 of that linear regression.
Best wishes to all. Patrick.
With the modelr package
modelr::rsquare(nls_model, data)
nls_model <- nls(mpg ~ a / wt + b, data = mtcars, start = list(a = 40, b = 4))
modelr::rsquare(nls_model, mtcars)
# 0.794
This gives essentially the same result as the longer way described by Tom from the rcompanion resource.
Longer way with nagelkerke function
nullfunct <- function(x, m){m}
null_model <- nls(mpg ~ nullfunct(wt, m),
                  data = mtcars,
                  start = list(m = mean(mtcars$mpg)))
nagelkerke(nls_model, null_model)[2]
# 0.794 or 0.796
Lastly, using predicted values
lm(mpg ~ predict(nls_model), data = mtcars) %>% broom::glance()
# 0.795
Like they say, it's only an approximation.
