I'm using JAGS to model engineering inverse problems in a Bayesian framework.
I would like to know whether I can use a function to define the mu parameter in the JAGS model. For example:
# Define the model:
modelString = "
model {
for ( i in 1:Ntotal ) {
myData[i] ~ dnorm(mu[i] ,1/sigma^2 )
mu[i]=function(c,fi){...}
}
c ~ dnorm( 9 , 1/9 )
fi ~ dnorm( 24 , 1/4 )
}
"
When I include the function I get an error:
Error parsing model file:
syntax error on line 6 near "{"
Is there some way to include a function inside the model?
Thanks
The short answer is that there is no way to define a new function directly in BUGS/JAGS in the way that you want, because BUGS is not a programming language. You are limited to using the functions and distributions listed in the JAGS user manual, or made available for use by loading external JAGS modules such as runjags or jags-wiener or (currently a small number of) others.
The slightly longer version is that you can define your own functions and distributions in JAGS by writing your own module to specify your desired function/distribution in C++ and then loading that into JAGS. The official JAGS documentation is currently light on details, but there is a tutorial published:
Wabersich, D., and J. Vandekerckhove. 2014. Extending JAGS: a tutorial on adding custom distributions to JAGS (with a diffusion model example). Behav. Res. Methods 46:15–28. doi:10.3758/s13428-013-0369-3.
This obviously requires familiarity with C++, but it is not that difficult if you are already a C++ coder. Installing the module is much easier if you embed the JAGS extension module within an R package, as the runjags package does (look in the /src directory). If you are not already a C++ coder, it is best to seek assistance.
Hope that helps,
Matt
---
Edit: it is also worth saying that there is probably a way of doing what you want in BUGS/JAGS, it is just that what you wanted to implement (writing a function inside the JAGS model) is not a viable solution. If you explain your actual problem in more detail (probably in a new question) then you might get a solution that you had not considered.
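One route that often works, sketched here under stated assumptions, is to write the deterministic relation inline with `<-`, using only functions JAGS already provides. The question does not say what the function computes, so the forward model below (c + s[i] * tan(fi), with fi in degrees), the predictor s, and the treatment of sigma as known data are all hypothetical stand-ins.

```r
# Hypothetical forward model: mu = c + s * tan(fi), with fi in degrees.
# 's' (a known predictor) and 'sigma' (a known noise sd) are assumed inputs.
modelString <- "
model {
  for (i in 1:Ntotal) {
    mu[i] <- c + s[i] * tan(fi * 3.141593 / 180)  # deterministic node, built-in tan()
    myData[i] ~ dnorm(mu[i], 1 / sigma^2)         # dnorm takes a precision
  }
  c  ~ dnorm(9, 1 / 9)
  fi ~ dnorm(24, 1 / 4)
}
"

# The same relation in plain R, used here only to simulate example data.
forward <- function(c, fi, s) c + s * tan(fi * pi / 180)

set.seed(1)
s      <- seq(10, 100, length.out = 20)
sigma  <- 2
myData <- rnorm(length(s), mean = forward(9, 24, s), sd = sigma)

# Fitting needs JAGS and the rjags package, so it is wrapped in a function
# and not run automatically.
fit_model <- function(n_iter = 2000) {
  jm <- rjags::jags.model(
    textConnection(modelString),
    data = list(myData = myData, s = s, sigma = sigma, Ntotal = length(s)),
    n.chains = 2
  )
  rjags::coda.samples(jm, variable.names = c("c", "fi"), n.iter = n_iter)
}
```

With JAGS installed, summary(fit_model()) would give posterior summaries for c and fi. If the real function cannot be written with the built-in functions, the C++ module route described above is the remaining option.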
I'm working in R and trying to get started with neural networks, using the keras package.
I'd like to use a custom loss function for training my NN. It's possible to do this by writing the custom loss function as lossFn <- function(y_true, y_pred) { ... } and passing it to the compile method as model %>% compile(loss = lossFn, ...).
Now in order to use the gradient descent method of training the NN, the loss function needs to be differentiable. I understand that you'd usually accomplish this by restricting yourself to using backend functions in your loss function, e.g.
lossFn <- function(y_true, y_pred) {
K <- backend()
K$mean(K$square(y_true - y_pred), axis = 1L)
}
or something like that.
Now, my problem is that I cannot express my loss function this way; I need to use functions that aren't available in the backend.
So my idea was that I'd work out the gradient myself on paper, and then provide it to compile as another argument, say compile(loss = lossFn, gradient = gradientFn, ...), with gradientFn suitably defined.
The documentation for keras (the R package!) does not indicate that this is possible. At the same time, it does not suggest it's not. And googling has turned up little that is relevant.
So my question is, is it possible?
An addendum: since Google has suggested that there are other training methods for NNs that do not rely on the gradient of the loss function, I should add I'm not too hung up on the specific training method. My ultimate goal isn't to manually supply the gradient of a custom loss function, it's to use a custom loss function to train the NN. The gradient is just a technical obstacle for me right now.
Thanks!
This is certainly possible in Keras, you'll just have to move up the stack a little and implement a train_step method and then call optimizer$apply_gradients().
Chapter 7 in the Deep Learning with R book covers this use case:
https://github.com/t-kalinowski/deep-learning-with-R-2nd-edition-code/blob/9f8b6d08dbb8d6565e4f5396e509aaea3e242b84/ch07.R#L608
Also, this keras guide may be useful, even though it's in Python and you're working in R (the Python interface is very similar to the R interface):
https://keras.io/guides/writing_a_training_loop_from_scratch/
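To make the idea concrete without depending on a TensorFlow installation, here is a minimal sketch in plain base R (it is not keras code). It pairs a custom loss with a gradient worked out by hand and applies the update manually, which is the role that train_step and optimizer$apply_gradients() play in the approach above. The log-cosh loss, the linear model, the simulated data and the learning rate are all assumptions chosen for illustration.

```r
# Plain R illustration (not keras): custom loss with a hand-derived gradient.
set.seed(42)
n <- 200
X <- cbind(1, rnorm(n))                        # design matrix with intercept
y <- drop(X %*% c(2, 3)) + rnorm(n, sd = 0.1)  # simulated response

# Custom loss: mean log-cosh of the residuals.
loss_fn <- function(w) mean(log(cosh(drop(X %*% w) - y)))

# Gradient worked out on paper: d/dw mean(log(cosh(r))) = t(X) %*% tanh(r) / n.
grad_fn <- function(w) drop(crossprod(X, tanh(drop(X %*% w) - y))) / n

# Manual training loop: each iteration is one "apply gradients" step.
w  <- c(0, 0)
lr <- 0.1
for (step in 1:2000) {
  w <- w - lr * grad_fn(w)
}
w
```

A finite-difference check of grad_fn against loss_fn is a cheap way to catch algebra mistakes before moving a hand-derived gradient into a real training loop.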
I have been implementing some negative binomial hurdle models in the R package glmmTMB and have come across something perplexing about the truncated negative binomial family.
In examining the source for that family argument I have found:
truncated_nbinom2 <- function(link="log") {
r <- list(family="truncated_nbinom2",
variance=function(mu,theta) {
stop("variance for truncated nbinom2 family not yet implemented")
})
return(make_family(r,link))
}
I am wondering if this family is still under development (as indicated by the stop command in the variance)?
It is documented as working in the vignette, and I am getting reasonable estimates from the models I have fit using this family (e.g. simulated data from the model seem sensible). I know many of the authors of the package are on this forum so I hoped someone might be able to clarify.
The truncated_nbinom2 family should work fine for most purposes. Looking through the glmmTMB source code (grep "\$variance" R/*.R), the $variance component of the family object is used in only two places: when computing Pearson residuals, and when creating objects to be used by the effects package.
You may run into trouble somewhere else in the pipeline, if you're using downstream packages that need to use the expected variance of an object to compute something. But everything else should be fine.
PS I found an expression for this variance and created an issue to remind us to implement it: https://github.com/glmmTMB/glmmTMB/issues/606
PPS this is in the development version now. Unfortunately, I'm pretty sure the paper I found only covers truncated NB2, so truncated NB1 may have to wait a while. However, the answer still applies: the absence of a variance function will only cause trouble in a few circumstances, and should never cause subtle trouble.
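For anyone who needs the variance in the meantime, it can be derived from standard moment identities for a zero-truncated distribution. The sketch below is not taken from glmmTMB or from the paper mentioned above. It assumes that mu and theta are the mean and dispersion of the untruncated NB2 distribution (variance mu + mu^2/theta), which should be checked against the package documentation, and the function name is made up for illustration.

```r
# Variance of a zero-truncated NB2 variable, from standard moment identities.
# mu and theta are the mean and dispersion of the UNtruncated distribution.
trunc_nbinom2_var <- function(mu, theta) {
  p0 <- dnbinom(0, size = theta, mu = mu)  # P(Y = 0) before truncation
  v  <- mu + mu^2 / theta                  # untruncated NB2 variance
  m1 <- mu / (1 - p0)                      # E[Y | Y > 0]
  m2 <- (v + mu^2) / (1 - p0)              # E[Y^2 | Y > 0]
  m2 - m1^2
}

# Check against a brute-force sum over the truncated probability mass function.
mu <- 2.5
theta <- 1.3
k  <- 1:5000
pk <- dnbinom(k, size = theta, mu = mu) / (1 - dnbinom(0, size = theta, mu = mu))
brute <- sum(k^2 * pk) - sum(k * pk)^2
c(formula = trunc_nbinom2_var(mu, theta), brute_force = brute)
```

The two numbers should agree to numerical precision, because the sum over k runs far enough into the tail that the omitted mass is negligible for these parameter values.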
I'm fitting a generalized linear model in R (using glm()) to data with two predictors in a full factorial design. I'm confident that the Gamma family is the right error distribution, but I'm not sure which link function to use, so I'd like to test all possible link functions against one another. Of course, I can do this manually by fitting a separate model for each link function and then comparing deviances, but I imagine there is an R function that will do this and compile the results. I have searched CRAN, SO, Cross Validated, and the web. The closest function I found was clm2, but I do not believe I want a cumulative link model, based on my understanding of what clms are.
My current model looks like this:
CO2_med_glm_alf_gamma <- glm(flux_median_mod_CO2~PercentH2OGrav+
I(PercentH2OGrav^2)+Min_Dist+
I(Min_Dist^2)+PercentH2OGrav*Min_Dist,
data = NC_alf_DF,
family=Gamma(link="inverse"))
How do I code this model into an R function that will do such a 'goodness-of-link' test?
(As far as the statistical validity of such a test goes, this discussion as well as a discussion with a stats post-doc led me to believe that it is valid to compare AIC or deviances between generalized linear models that are identical except for having different link functions.)
This is not "all possible links", it's testing against a specified class of links, but there is a goodness-of-link test by Pregibon that is implemented in the LDdiag package. It's not on CRAN, but you can install it from the archives via
devtools::install_version("LDdiag","0.1")
The example given (not that exciting) is
quine$Days <- ifelse(quine$Days==0, 1, quine$Days)
ex <- glm(Days ~ ., family = Gamma(link="log"), data = quine)
pregibon(ex)
The Pregibon family of link functions is implemented in the glmx package. As pointed out by Achim Zeileis in the comments, the package provides various parametric link functions and supports general estimation and inference based on such parametric links (or, more generally, parametric families). For a worked example of how this can be employed for a variety of goodness-of-link assessments, see example("WECO", package = "glmx"). This replicates the analyses from two papers by Koenker and Yoon (see below).
This example might be useful too.
Koenker R (2006). “Parametric Links for Binary Response.” R News, 6(4), 32--34; link to page with supplementary materials.
Koenker R, Yoon J (2009). “Parametric Links for Binary Choice Models: A Fisherian-Bayesian Colloquy.” Journal of Econometrics, 152, 120--130; PDF.
I have learned that the dredge function (MuMIn package) can be used to perform goodness-of-link tests on glms, lms, etc. More generally it is a model selection function but allows for a good deal of customization. In this case, you can use the varying option to compare models fit with different link functions. See the Beetle example that they work for details.
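For reference, the manual comparison described in the question is short enough to write as a loop in base R. The sketch below uses simulated data and a single predictor as stand-ins (the data frame d and its columns are hypothetical), and covers the three links that Gamma() accepts.

```r
# Fit the same Gamma GLM under each link that Gamma() accepts and compare fits.
set.seed(1)
d <- data.frame(x = runif(200, 1, 3))
d$y <- rgamma(200, shape = 5, rate = 5 / exp(0.5 + 0.4 * d$x))  # simulated data

links <- c("inverse", "identity", "log")
fits <- lapply(links, function(l) {
  # tryCatch: on real data some links can fail to find valid starting values
  tryCatch(glm(y ~ x, family = Gamma(link = l), data = d),
           error = function(e) NULL)
})

link_table <- data.frame(
  link     = links,
  AIC      = sapply(fits, function(f) if (is.null(f)) NA_real_ else AIC(f)),
  deviance = sapply(fits, function(f) if (is.null(f)) NA_real_ else deviance(f))
)
link_table[order(link_table$AIC), ]
```

Replacing y ~ x and d with the real formula and data frame gives one row per link, sorted by AIC, with NA for any link whose fit failed.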
I have a question pertaining to the R presentation by José A. Sánchez-Espigares and Jordi Ocaña entitled "An R implementation of bootstrap procedures for mixed models".
See: http://www.r-project.org/conferences/useR-2009/slides/SanchezEspigares+Ocana.pdf
On slide 22 on the examples they use the function bootstrap() (in the slide it is: sleep.boot=bootstrap(model,B=1000) ).
The only package they reference is lme4 but that package does not contain the bootstrap() function and I get:
Error: could not find function "bootstrap"
Does anybody know what package they are using here?
I am 99% sure that lme4 never had a bootstrap() function, and that it in fact came from code written by the presenters (I have looked for that code on-line before but have never found it). The built-in bootMer function tries to do most of what this presentation describes, but doesn't do everything (e.g. it doesn't include an implementation of a Wild bootstrap).
If you want to use bootMer without specifying a function FUN, you might take a look at confint(.,method="boot") ... it looks like the summary function that the authors used might have been something like
function(x) c(fixef(x),getME(x,"theta"),sigmaREML=sigma(x))
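A minimal sketch of how that summary function could be passed to bootMer, assuming lme4 is installed and using its sleepstudy data. The random-intercept model and the small nsim are choices made here for speed, not taken from the presentation.

```r
# Parametric bootstrap of fixed effects, random-effect parameters and sigma.
# Guarded so the block does nothing when lme4 is not installed.
if (requireNamespace("lme4", quietly = TRUE)) {
  library(lme4)
  fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

  # Summary statistic computed on each bootstrap refit
  boot_stat <- function(x) c(fixef(x), getME(x, "theta"), sigmaREML = sigma(x))

  set.seed(101)
  sleep_boot <- bootMer(fm, FUN = boot_stat, nsim = 50)  # small nsim for speed

  # Percentile intervals from the bootstrap replicates (one row per statistic)
  t(apply(sleep_boot$t, 2, quantile, probs = c(0.025, 0.975)))
}
```

For real use, nsim would normally be much larger (the slide uses B=1000), and confint(fm, method = "boot") wraps similar steps.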
I am trying to find a reference that explains how standard errors are computed for local polynomial regression. Specifically, in R one can use the loess function to get a model object and then use the predict function to retrieve standard errors. Is there a reference somewhere for what is actually happening? And when there may be serial correlation in the residuals, so that one must adjust using Newey-West type methods, is there a way to use the sandwich package to do this as you would for a regular OLS fit using lm?
I tried looking at the source but the standard error computation calls a C function.
The "Source" section of ?loess tells you that the underlying C-code comes from the cloess package of Cleveland et al., and points you to its web home:
Source:
The 1998 version of ‘cloess’ package of Cleveland, Grosse and
Shyu. A later version is available as ‘dloess’ at http://www.netlib.org/a.
Going there, you will find a link to a 50 page document (warning: postscript doc) that should tell you everything you need to know about this implementation of loess. In Cleveland's words:
This guide describes crucial steps in the proper analysis of data using
loess. Please read it.
Of particular interest will be the first couple pages of "Section 4: Statistical and Computational Methods".
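For completeness, this is how the standard errors mentioned in the question are retrieved in R and turned into approximate pointwise intervals. It is a small sketch using the built-in cars data. It shows only what predict() returns and does not address the serial-correlation adjustment.

```r
# Pointwise standard errors and approximate 95% intervals from a loess fit.
lo   <- loess(dist ~ speed, data = cars)
grid <- data.frame(speed = seq(5, 25, by = 1))
p    <- predict(lo, newdata = grid, se = TRUE)

names(p)  # components: fit, se.fit, residual.scale, df

# t-based intervals using the approximate degrees of freedom reported by loess
ci <- cbind(
  fit   = p$fit,
  lower = p$fit - qt(0.975, p$df) * p$se.fit,
  upper = p$fit + qt(0.975, p$df) * p$se.fit
)
head(ci)
```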