I'm looking for a Bayesian parallel for nonlinear mixed effects models, specifically those using the nlme package in R.
I've come across blme but that seems to be only for linear mixed-effects models. Would brms be appropriate in this case? I've tried to write some code that's analogous to the nlme construction below with the function brm.
library(nlme)
model <- nlme(height ~ exp(beta1*age + 1),
data = Loblolly,
fixed = list(beta1 ~ 1),
random = list(Seed = pdDiag(list(beta1 ~ 1))),
start = list(fixed = c(beta1 = 3)))
library(brms)
bayesian_model <- brm(bf(height ~ exp(beta1*age + 1), beta1 ~ 1, nl = TRUE),
data = Loblolly,
prior = c(prior(normal(0, 1), nlpar = beta1)))
I was able to get to this point, but how exactly do I specify random effects for beta1? And how would I specify the diagonal variance structure like I have with random = list(Seed = pdDiag(list(beta1 ~ 1)))?
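One possible way to express this, offered as a sketch rather than a definitive answer: in brms, group-level (random) effects on a nonlinear parameter are written in that parameter's own formula using lme4-style syntax, so `beta1 ~ 1 + (1 | Seed)` lets beta1 vary by Seed. With a single random effect, pdDiag reduces to one variance, which corresponds to the single group-level standard deviation brms estimates here. The names `lob` and `nl_formula` are introduced only for illustration, and the conversion with as.data.frame() is a precaution that may not be needed.

```r
library(brms)

# Loblolly ships as a groupedData object; converting it to a plain data frame
# is a precaution (an assumption: brm() may accept the original object as-is).
lob <- as.data.frame(Loblolly)

# Nonlinear formula with a group-level (random) effect on beta1, varying by Seed
nl_formula <- bf(height ~ exp(beta1 * age + 1),
                 beta1 ~ 1 + (1 | Seed),
                 nl = TRUE)

# List the parameters that can take priors, including the Seed-level sd of beta1
get_prior(nl_formula, data = lob)

# Same prior on the population-level beta1 as in the question
bayesian_model <- brm(nl_formula,
                      data = lob,
                      prior = c(prior(normal(0, 1), nlpar = beta1)))
```

If the model had several nonlinear parameters, each with its own `(1 | Seed)` term, brms would treat those group-level effects as uncorrelated by default, which mirrors a diagonal structure; sharing an ID, as in `(1 | ID | Seed)`, is what requests correlations between them.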
I'm re-raising the issue of predicting CIs for gamlss models using the newdata argument. A further complication is that I'm also interested in partial effects.
A closely related issue (without partial effects) went unresolved in 2018: Error when predicting new fitted values from R gamlss object.
I'm wondering if there have been updates that also extend to partial effects. The example below reproduces the error (note that type = "terms" specifies that I'm interested in the effect of each model term).
library(gamlss)
library(tidyverse)
#example data
test_df <- tibble(x = rnorm(1e4),
x2 = rnorm(n = 1e4),
y = x2^2 + rnorm(1e4, sd = 0.5))
#fitting gamlss model
gam_test = gamlss(formula = y ~ pb(x2) + x,
sigma.fo= y ~ pb(x2) + x,
data = test_df)
#data I want predictions for
pred_df <- tibble(x = seq(-0.5, 0.5, length.out = 300),
x2 = seq(-0.5, 0.5, length.out = 300))
#returns error when se.fit = TRUE
pred <- predictAll(object = gam_test,
type = "terms",
se.fit = TRUE, #works if se.fit = FALSE
newdata = pred_df)
Many thanks in advance!
I talked to the main developer of the gamlss software (who is responsible for this function).
He says that the option se.fit = TRUE with type = "terms" has not yet been implemented, and unfortunately he is too busy at present.
One idea is to bootstrap the original data, predict the terms for each bootstrap sample, and then use the results to obtain CIs.
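The bootstrap idea could be sketched roughly as follows. This is an untested outline under several assumptions: it uses a smaller simulated data set and plain data frames instead of tibbles, a nonparametric row bootstrap, and simple percentile intervals; the names `sim_df`, `new_df`, `n_boot`, `boot_terms` and `term_ci` are made up for illustration. The loop is kept at top level, and `data = boot_df` is passed to predict(), so that the data for each refit can be found.

```r
library(gamlss)

set.seed(123)

# Simulated data in the spirit of the question (smaller n to keep it quick)
n <- 500
sim_df <- data.frame(x = rnorm(n), x2 = rnorm(n))
sim_df$y <- sim_df$x2^2 + rnorm(n, sd = 0.5)

# Data to predict the term contributions for
new_df <- data.frame(x  = seq(-0.5, 0.5, length.out = 50),
                     x2 = seq(-0.5, 0.5, length.out = 50))

n_boot <- 100
boot_terms <- vector("list", n_boot)

for (b in seq_len(n_boot)) {
  # Resample rows with replacement and refit the model
  boot_df <- sim_df[sample(nrow(sim_df), replace = TRUE), ]
  fit_b <- gamlss(y ~ pb(x2) + x,
                  sigma.formula = ~ pb(x2) + x,
                  data = boot_df, trace = FALSE)
  # Term-wise predictions for mu on the new data (no se.fit needed)
  boot_terms[[b]] <- predict(fit_b, what = "mu", newdata = new_df,
                             type = "terms", data = boot_df)
}

# Stack the replicates and take percentile limits per row and term
term_array <- simplify2array(boot_terms)
term_ci <- apply(term_array, c(1, 2), quantile, probs = c(0.025, 0.975))
```

The first dimension of `term_ci` holds the lower and upper limits. More replicates would give more stable intervals, at the cost of run time.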
I have a question about using an interaction term in the mgcv package with two linear predictors.
I wrote this code to fit the interaction between x1 and x2:
mgcv::gam(y ~ x1 + x2 + ti(x1, x2, k = 3), method = "REML", data = DATA)
but I'm not sure this is the correct way to use the ti function in mgcv.
Should I use a different method (like glm?) for this?
Thank you for your answer.
If you want a non-linear interaction, you have two choices:
gam(y ~ te(x1, x2), method = "REML", data = DATA)
or
gam(y ~ s(x1) + s(x2) + ti(x1, x2), method = "REML", data = DATA)
In the first model, the main effects and interactions are bound up in the single two-dimensional function te(x1, x2).
In the second model, ti(x1, x2) excludes the main smooth effects (the marginal terms for x1 and x2), so you should also include those main smooth effects using s().
With your formulation, you would not get any non-linear main effects, only linear main effects and a non-linear pure interaction, which is likely not what you want.
Here's an example:
library("mgcv")
library("gratia")
library("ggplot2")
library("patchwork")
set.seed(1)
df2 <- data_sim("eg2", n = 1000, dist = "normal", scale = 1, seed = 2)
m1 <- gam(y ~ s(x, k = 5) + s(z, k = 5) + ti(x, z, k = 5),
data = df2,
method = "REML")
m2 <- gam(y ~ x + z + ti(x, z, k = 5),
data = df2,
method = "REML")
pl <- plot_layout(nrow = 1, ncol = 3)
p1 <- draw(m1) + pl
p2 <- draw(m2, parametric = TRUE) + pl
p1 - p2 + plot_layout(nrow = 2)
which produces a figure with the smooths of m1 in the top row and the terms of m2 in the bottom row.
Notice how, in this case, you would be missing the non-linearity in the marginal smooths/terms, which is not accounted for in the ti() term because it has no marginal main effects (the ti() smooths are the same across both models).
If you just want to fit linear "main effects" and their linear interaction, use the formula as you would with glm():
gam(y ~ x1 + x2 + x1:x2, ....)
Note that the term "linear predictor" refers to the entire model (in this case), or more specifically the entire formula on the RHS of ~.
I am using the following code to fit and test a random forest classification model:
control <- trainControl(method = 'repeatedcv',
number = 5, repeats = 3,
search = 'grid')
tunegrid <- expand.grid(.mtry = (1:12))
rf_gridsearch <- train(y = river$stat_bino,
x = river[, colnames(river) != "stat_bino"],
data = river,
method = 'rf',
metric = 'Accuracy',
ntree = 600,
importance = TRUE,
tuneGrid = tunegrid, trControl = control)
Note, I am using
train(y = river$stat_bino, x = river[,colnames(river) != "stat_bino"],...
rather than: train(stat_bino ~ .,...
so that my categorical variables will not be turned into dummy variables.
(Solution here: variable encoding in K-fold validation of random forest using package 'caret')
I would like to extract the finalModel and use it to make partial dependence plots for my variables (using the code below), but I get an error message and don't know how to fix it.
model1 <- rf_gridsearch$finalModel
library(pdp)
partial(model1, pred.var = "MAXCL", type = "classification", which.class = "1", plot = TRUE)
Error in eval(stats::getCall(object)$data) :
..1 used in an incorrect context, no ... to look in
Thanks for any solutions here!
I want to estimate the spatial panel autoregressive model
y_{t} = a + \rho W y_{t} + \epsilon_{t}
where a is a vector of individual fixed effects. I am using the excellent splm package in R.
Note that I don't have any independent variables X here - if I include some regressors X there is no problem, but I wonder how to specify the model with splm in the absence of independent variables.
library(splm)
library("spdep")
data("Produc", package = "Ecdat")
data("usaww")
usalw <- mat2listw(usaww)
# this works well since I have independent regressors
spml(formula = log(gsp) ~ log(pcap), data = Produc,
listw = usaww, lag = TRUE, spatial.error = "none", model = "within",
effect = "twoways")
# this does not work
spml(formula = log(gsp) ~ ., data = Produc,
listw = usaww, lag = TRUE, spatial.error = "none",
model = "within", effect = "individual")
To estimate an "empty" (intercept-only) model, the formula has to be y ~ 1. This currently works with random or no individual effects; the "within" (fixed effects) estimators need a fix.
A workaround for getting the FE estimates is to demean the data explicitly:
library(plm)
spml(formula = Within(log(gsp)) ~ 1, data = Produc,
listw = usaww, lag = TRUE, spatial.error = "none",
model = "pooling")
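To make the demeaning step concrete, here is a small base R sketch of the within transformation, which subtracts each unit's own mean from its observations. The toy data frame `panel` and its columns are hypothetical, and the claim that this matches what Within() does with individual effects is an assumption about plm's default behavior rather than something verified here.

```r
# Hypothetical toy panel: 3 units observed over 4 periods each
panel <- data.frame(
  id = rep(c("a", "b", "c"), each = 4),
  t  = rep(1:4, times = 3),
  y  = c(1.0, 1.5, 2.0, 2.5,
         4.0, 4.2, 4.1, 4.7,
         9.0, 8.5, 8.8, 9.3)
)

# Within transformation: subtract each unit's mean from its own observations
panel$y_within <- panel$y - ave(panel$y, panel$id)

# Each unit's demeaned series now averages to (numerically) zero
tapply(panel$y_within, panel$id, mean)
```

Because the individual means are removed before estimation, a pooled model fitted to the demeaned response no longer needs unit-specific intercepts.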
Are there any functions that can simulate the response variable from a robust linear model or a quantile regression model like there is for linear models (i.e. stats::simulate.lm)?
If not, is there a way to adapt code to do this for either model?
Here is an example of the kind of data and models I am dealing with:
#Data
df <- data.frame(Response = c(1:30 + rnorm(n = 30)), Covariate = c(seq(from = 10, to = 1.3, by = -0.3)))
#Robust linear regression
fit.rlm <- MASS::rlm(Response ~ Covariate, data = df)
#Quantile regression
fit.qr <- quantreg::rq(Response ~ Covariate, data = df, tau = c(0.025, 0.5, 0.975))