R-INLA model hyperparameters have to.theta and from.theta functions that appear to be for converting between different parameterisations. It would be convenient to use those conversion functions but how does one do so?
Example with ar1
From the ar1 documentation (http://www.math.ntnu.no/inla/r-inla.org/doc/latent/ar1.pdf):
The parameter rho is represented as theta_2 = log((1 + rho)/(1 - rho))
and further down under hyper, theta2 we have to.theta 'function(x) log((1+x)/(1-x))'. It would be nice if we could use that to convert between rho and theta_2.
Let's try using an example
library(INLA)
# Example from ar1 documentation (http://www.math.ntnu.no/inla/r-inla.org/doc/latent/ar1.pdf)
#simulate data
n = 100
rho = 0.8
prec = 10
## note that the marginal precision would be
marg.prec = prec * (1-rho^2)
E=sample(c(5,4,10,12),size=n,replace=T)
eta = as.vector(arima.sim(list(order = c(1,0,0), ar = rho), n = n,sd=sqrt(1/prec)))
y=rpois(n,E*exp(eta))
data = list(y=y, z=1:n, E=E)
## fit the model
formula = y~f(z,model="ar1")
result = inla(formula,family="poisson", data = data, E=E)
That runs fine.
Can we use to.theta like this?
formula.to.theta = y~f(z,model="ar1",
hyper = list(rho = list(initial = to.theta(0.25))))
result = inla(formula.to.theta,family="poisson", data = data, E=E)
# Error in to.theta(0.25) : could not find function "to.theta"
So we can't use it like that. Is there another way to specify formula.to.theta that would work?
I'm pretty sure the answer to your question is "no". The documentation is not saying that the package exports functions by those names; rather, each entry in a model's hyper element carries functions by those names, with the values given in the documentation. There is no reason to think that appending those names to formula. would produce a meaningful function. Here is how to examine the value of from.theta in the object returned by a specific call to the f function:
library(INLA)
eval( f(z, model = "ar1") )$hyper$theta3$from.theta
===== result ========
function (x)
x
<environment: 0x7fdda6214040>
attr(,"inla.read.only")
[1] TRUE
The result from f( , "ar1") actually has three thetas, each with a to.theta and a from.theta function. You may be trying to change the hyper$thetaX$param value, which does not have an attr(,"inla.read.only") value of TRUE.
It would probably be more informative for you to execute this:
eval( f(z, model = "ar1") )$hyper
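Since the transformation is just the formula quoted in the question, one workaround is to write the conversion by hand. The following is a minimal sketch in plain R, assuming the to.theta expression from the ar1 documentation and using its algebraic inverse for the way back. The names rho_to_theta and theta_to_rho are made up for illustration and are not part of INLA, and it is assumed that initial is expected on the internal (theta) scale, as the question's attempt implies.

```r
# Hand-written copies of the ar1 transformation for rho (theta2).
# rho_to_theta mirrors the documented to.theta; theta_to_rho is its
# algebraic inverse. Both are hypothetical helpers, not INLA functions.
rho_to_theta <- function(rho) log((1 + rho) / (1 - rho))
theta_to_rho <- function(theta) 2 * exp(theta) / (1 + exp(theta)) - 1

theta_init <- rho_to_theta(0.25)        # value on the internal scale
rho_back   <- theta_to_rho(theta_init)  # converts back to the rho scale

# The value could then be supplied as the initial value (not run here,
# since it needs INLA and the simulated data from the question):
# formula.to.theta <- y ~ f(z, model = "ar1",
#                           hyper = list(rho = list(initial = theta_init)))
```

Round-tripping a value through both helpers is a quick way to check that the pair is consistent.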
I am trying to de-clutter some scripts by creating functions to complete repetitive tasks in R. One task I complete repeatedly is fitting a linear model to a set of data and creating predictions from that linear model fit. The data I am working with are concentration and flow data from streams. Flow is always the explanatory variable, but the response variable changes, so I would like to include it as a function input. However, I receive a "non-numeric argument to mathematical function" error when I run the function. I have tried both with and without quotes, since the lm() call does not require quotes, but that results in the classic "object 'myobject' not found". Here's a simple example.
Update
flows <- seq(0,7,0.01)
dat <- tibble(flow=sample(flows,30),
parameter1_conc=rnorm(30,15,4),
parameter2_conc=rnorm(30,50,8))
regr_func <- function(modeldata,parameter,pred_maxflow,pred_flowint) {
mod <- lm(as.formula(paste('log(', parameter, ') ~ log(', flow, ')')), data=modeldata)
newflow <- data.frame(flow = seq(0, pred_maxflow, pred_flowint))
preds <<- predict(mod, newdata = newflow,
interval = 'prediction')
}
regr_func(modeldata = dat,
parameter = 'parameter1_conc',
pred_maxflow = 20,
pred_flowint = 0.001)
Original Example Error
flows <- seq(0,7,0.01)
dat <- tibble(flow=sample(flows,30),
parameter1_conc=rnorm(30,15,4),
parameter2_conc=rnorm(30,50,8))
regr_func <- function(modeldata,parameter,pred_maxflow,pred_flowint) {
mod <- lm(log(parameter)~log(flow), data = modeldata)
newflow <- data.frame(flow = seq(0, maxflow, flowint))
preds <<- predict(mod, newdata = newflow,
interval = 'prediction')
}
regr_func(modeldata = dat,
parameter = 'parameter1_conc',
pred_maxflow = 20,
pred_flowint = 0.001)
There are 3 issues here. The main one is that log(parameter) in your lm formula does not get substituted for the variable passed in as parameter. That means lm is literally looking for a column called parameter in your data, which doesn't exist. You can fix this by creating a formula with the name substituted in. Although doing this with strings is the most commonly used method to do this, it is a bit more efficient and safer to use substitute. This also allows you to pass your column name without quotes.
The second issue is that the arguments maxflow and flowint should probably be pred_maxflow and pred_flowint to match your function parameters.
Thirdly, using the <<- operator to write to a variable outside the function (it assigns in an enclosing environment, typically the global one) is bad practice. R users expect functions not to have such side effects, and know to store the output of function calls in variables under their control. Only in very rare circumstances should this be done within a function.
Putting all this together, we have:
regr_func <- function(modeldata, parameter, pred_maxflow, pred_flowint) {
f <- `[[<-`(x ~ log(flow), 2, substitute(log(parameter)))
mod <- lm(f, data = modeldata)
newflow <- data.frame(flow = seq(0, pred_maxflow, pred_flowint))
predict(mod, newdata = newflow, interval = 'prediction')
}
And we would call the function like this:
preds <- regr_func(modeldata = dat,
parameter = parameter1_conc,
pred_maxflow = 20,
pred_flowint = 0.001)
resulting in:
head(preds)
#> fit lwr upr
#> 1 Inf NaN NaN
#> 2 3.365491 2.188942 4.542041
#> 3 3.312636 2.219223 4.406049
#> 4 3.281717 2.236294 4.327140
#> 5 3.259780 2.248073 4.271488
#> 6 3.242765 2.256998 4.228531
Created on 2022-06-03 by the reprex package (v2.0.1)
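For comparison, the string-based approach mentioned above can be sketched as follows. This is only an illustration: regr_func_chr and the small data frame are hypothetical, the column name is passed as a quoted string, and the prediction grid starts at pred_flowint instead of 0 to avoid log(0).

```r
# String-based variant: 'parameter' is a column name given as a character string.
regr_func_chr <- function(modeldata, parameter, pred_maxflow, pred_flowint) {
  # Build e.g. log(parameter1_conc) ~ log(flow) from the string
  f <- as.formula(paste0("log(", parameter, ") ~ log(flow)"))
  mod <- lm(f, data = modeldata)
  # Start the grid at pred_flowint rather than 0 so log(flow) stays finite
  newflow <- data.frame(flow = seq(pred_flowint, pred_maxflow, pred_flowint))
  predict(mod, newdata = newflow, interval = "prediction")
}

# Made-up positive data so that both logs are defined
set.seed(1)
dat_chr <- data.frame(flow = runif(30, 0.1, 7),
                      parameter1_conc = runif(30, 5, 25))

preds_chr <- regr_func_chr(dat_chr, "parameter1_conc",
                           pred_maxflow = 20, pred_flowint = 0.5)
head(preds_chr)
```

The trade-off is that a typo in the string only surfaces when lm() evaluates the formula, whereas the substitute version lets the column name be written unquoted.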
I am modelling a time series as a GARCH(1,1)-process:
And the z_t are t-distributed.
In R, I do this in the fGarch-package via
model <- garchFit(formula = ~garch(1,1), cond.dist = "std", data=r)
Is this correct?
Now, I would like to understand the output of this to check my formula.
Obviously, model@fit$coefs gives me the coefficients and model@fitted gives me the fitted r_t.
But how do I get the fitted sigma_t and z_t?
I believe that the best way is to define extractor functions when generics are not available and methods when generics already exist.
The first two functions extract the values of interest from the fitted objects.
get_sigma_t <- function(x, ...){
x@sigma.t
}
get_z_t <- function(x, ...){
x@fit$series$z
}
Here a logLik method for objects of class "fGARCH" is defined.
logLik.fGARCH <- function(x, ...){
x@fit$value
}
Now use the functions, including the method. The data comes from the first example in help("garchFit").
N <- 200
r <- as.vector(garchSim(garchSpec(rseed = 1985), n = N)[,1])
model <- garchFit(~ garch(1, 1), data = r, trace = FALSE)
get_sigma_t(model) # output not shown
get_z_t(model) # output not shown
logLik(model)
#LogLikelihood
# -861.9494
Note also that the methods coef and fitted exist, so there is no need for model@fitted or model@fit$coefs, as written in the question.
fitted(model) # much simpler
coef(model)
# mu omega alpha1 beta1
#3.541769e-05 1.081941e-06 8.885493e-02 8.120038e-01
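The fitted model is an S4 object whose slots hold, among other things, lists. You can inspect its structure with
str(model)
Once you know the structure, it is easier to extract elements with @ (for slots) or $ (for list elements):
model@fit$series$z
model@sigma.t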
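To illustrate the difference between the two accessors, here is a small self-contained sketch. ToyFit is a made-up S4 class that only mimics the shape described above (one numeric slot and one list slot); it is not the real fGARCH class, and the numbers are arbitrary.

```r
library(methods)

# Toy S4 class standing in for a fitted-model object (NOT the fGARCH class)
setClass("ToyFit", representation(sigma.t = "numeric", fit = "list"))

toy <- new("ToyFit",
           sigma.t = c(0.10, 0.12, 0.11),
           fit = list(value = -12.3,
                      series = list(z = c(0.5, -1.2, 0.3))))

slotNames(toy)     # the slots of the object; these are accessed with @
toy@sigma.t        # a slot holding a numeric vector
toy@fit$series$z   # a slot holding a list: @ first, then $ inside the list
```

Calling str(toy) prints the same nesting in one go, which is how the path to each element can be found.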
Extract the baseline hazard function h0(t) from glmnet object
I want to know the hazard function at time t: h(t,X) = h0(t) exp[Σ βi*Xi]. How can I extract the baseline hazard function h0(t) from a glmnet object in R?
What I know is that the function basehaz() in the survival package can extract the baseline hazard function from a coxph object only.
I also found a function, glmnet.basesurv(time, event, lp, times.eval = NULL, centered = FALSE). But when I try to use this function, there is an error.
Error: could not find function "glmnet.basesurv"
Below is my code, using glmnet to fit the cox model and obtained the coefficients of selected variables. Is it possible to get the baseline hazard function h0(t) from this glmnet object?
Code
# Split data into training data and testing data
set.seed(101)
train_ratio = 2/3
sample <- sample.int(nrow(x), floor(train_ratio*nrow(x)), replace = F)
x.train <- x[sample, ]
x.test <- x[-sample, ]
y.train <- y[sample, ]
y.test <- y[-sample, ]
surv_obj <- Surv(y.train[,1],y.train[,2])
#
my_alpha = 0.5
fit = glmnet(x = x.train, y = surv_obj, family = "cox",alpha = my_alpha) # fit the model with elastic net method
plot(fit,xvar="lambda", main="cox model coefficient paths(glmnet.fit)\n\n") # Plot the paths for the fit
fit
# cross validation to find out best lambda
cv_fit = cv.glmnet(x = x.train,y = surv_obj , family = "cox",nfolds = 10,alpha = my_alpha)
tencrossfit <- cv_fit$glmnet.fit
plot(cv_fit, main="Cross-validated Deviance(10 folds cv.glmnet.fit)\n\n")
plot(tencrossfit, main="cox model coefficient paths(10 folds cv.glmnet.fit)\n\n")
max(cv_fit$cvm)
summary(cv_fit$cvm)
cv_fit$lambda.min
cv_fit$lambda.1se
coef.min = coef(cv_fit, s = "lambda.1se")
pred_min_value2 <- predict(cv_fit, s=cv_fit$lambda.min, newx=x.test,type="link")
I really appreciate any help you can provide.
The glmnet.basesurv function is part of the hdnom package (which is available on CRAN), not glmnet itself. So install that, and then call it.
I had a similar question. After installing hdnom with install.packages("hdnom"), listing its functions with library(help = "hdnom")
shows that the function is actually called glmnet_survcurve(). I got it working as hdnom:::glmnet_survcurve(); here is an example:
S <- Surv(data$survtimed, data$outcome)
X_glm<-model.matrix(S~.,data[, c("factor1", "factor2")])
cox_model <- glmnet(X_glm, S, family="cox", alpha=1, lambda=0.2)
times = c(1, 2) # predict survival and linear predictors at times 1 and 2
predictions = hdnom:::glmnet_survcurve(cox_model, S[,1], S[,2], X_glm, survtime = times)
predictions$p[,1] #survival probability at time 1
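For reference, the baseline cumulative hazard can also be computed by hand from the linear predictor. Below is a minimal sketch of the Breslow estimator in base R. It is not taken from glmnet or hdnom: breslow_cumhaz is a hypothetical helper, and lp is assumed to be the linear predictor on the training data, for instance what predict(..., type = "link") returns.

```r
# Breslow estimate of the baseline cumulative hazard H0(t).
# time:  observed times; event: 1 = event, 0 = censored;
# lp:    linear predictor (sum of beta_i * X_i) for each observation.
breslow_cumhaz <- function(time, event, lp) {
  event_times <- sort(unique(time[event == 1]))
  risk <- exp(lp)
  # At each event time: number of events / total risk of those still at risk
  increments <- vapply(event_times, function(t) {
    sum(event == 1 & time == t) / sum(risk[time >= t])
  }, numeric(1))
  data.frame(time = event_times,
             hazard = increments,        # jump of H0 at each event time
             cumhaz = cumsum(increments))
}

# Tiny made-up example with lp = 0 for everyone
toy_time  <- c(1, 2, 3, 4)
toy_event <- c(1, 0, 1, 1)
H0 <- breslow_cumhaz(toy_time, toy_event, lp = rep(0, 4))
H0
```

Under these assumptions, the survival curve for a new observation with linear predictor lp_new would be exp(-H0$cumhaz * exp(lp_new)).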
I have a MA(1) model with known parameter and known .
I'd like to know, is there a function in R that can return for me?
I also tried to get by iteration:
However, in reality, is unknown and cannot be specified at the first place.
I'm having this question because I used gnls to estimate a nonlinear model with residuals being MA(1) process. The code is something like:
model = gnls(y ~ c + log( x1^g + x2^g), start = list(c = 0.04, g = 0.3),
correlation = corARMA(c(0.5), form = ~ 1, p = 0, q = 1, fixed = FALSE))
It returns every parameter estimation including . But residuals(model) returns instead of .
So any suggestions?
Thank you for the help in advance.
Yes. You can use the arima function available in R.
fit <- arima(ts(data), order=c(0,0,1))
Since you do not want the AR and I (integration) parts, you can set their orders to zero.
summary(fit)
You can inspect the estimated parameters and errors with the summary function.
For more information, refer to : https://www.otexts.org/fpp/8/7
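As an illustration, here is a minimal sketch on simulated data, assuming the quantity of interest is the innovation (error) series of the MA(1) process. The series, the seed, and the true coefficient of 0.5 are made up; only base R's stats functions are used.

```r
set.seed(42)
n <- 200
y <- arima.sim(model = list(ma = 0.5), n = n)  # simulated MA(1) series

fit_ma1 <- arima(y, order = c(0, 0, 1))  # MA(1) with an estimated mean
coef(fit_ma1)                            # named estimates: ma1 and intercept
e_hat <- residuals(fit_ma1)              # estimated innovations, one per time point
head(e_hat)
```

Printing fit_ma1 directly also shows the coefficient estimates together with their standard errors.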
I'm using the nls.lm function from the minpack.lm package and something "weird" happens when I change the order of the parameters in the residual function
This code works :
install.packages('minpack.lm')
library(minpack.lm)
## values over which to simulate data
x <- seq(0,100,length=100)
## model based on a list of parameters
getPrediction <- function(parameters, x)
parameters$A*exp(-parameters$alpha*x) + parameters$B*exp(-parameters$beta*x)
## parameter values used to simulate data
pp <- list(A = 2, B = 0.8, alpha = 0.6, beta = 0.01)
## simulated data, with noise
simDNoisy <- getPrediction(pp,x) + rnorm(length(x),sd=.01)
#simDNoisy[seq(1,10)] = rep(10,11)
simDNoisy[1] = 4
## plot data
plot(x,simDNoisy, main="data")
## residual function
residFun <- function(parameters, observed, xx)
sqrt(abs(observed - getPrediction(parameters, xx)))
## starting values for parameters
parStart <- list(Ar = 3, Br = 2, alphar = 1, betar = 0.05)
## perform fit
rm(nls.out)
nls.out <- nls.lm(par=parStart,
fn = residFun,
observed = simDNoisy,
xx = x,
control = nls.lm.control(nprint=1))
nls.out
It doesn't work if I replace the residual function by this (just change parameters order)
residFun <- function(xx, parameters, observed )
sqrt(abs(observed - getPrediction(xx, parameters)))
Error in parameters$A : $ operator is invalid for atomic vectors
Why does it cause this error ?
Arguments should be passed in the order in which the function defines its parameters. The only exception is when you name them explicitly, in which case the order does not matter. Consider this example of a function with two parameters:
theParameters=function(X,Y){
print(paste("I think X is",X))
print(paste("I think Y is",Y))
}
theParameters(X=2,Y=10)
theParameters(Y=10,X=2)
#you can change the parameter order if you identify them with parameter=...
#but if you don't, it assumes it's in the order of how the function is defined.
# which of these is X and which is Y?
theParameters(10,2)
It's preferable to always name the arguments, and it is necessary if they are passed out of order. (Some other languages don't even let you change the order of arguments in a call.)
getPrediction(x=xx,parameters=parameters)
In this case the cause is that getPrediction matches unnamed arguments by position, regardless of what the variables you pass in are called. Without the names, this line
getPrediction(xx,parameters)
means this to R
getPrediction(parameters=xx,x=parameters)
because that matches the original signature of the function.
So the function's version of parameters is what you pass in as xx, and so on.
Because you gave the function's parameters the same names as the variables you pass in, it can be confusing. It is easier to follow if the function's own parameter names differ slightly from the variable names. Alternatively, if the scopes of the variables allow it, you don't even have to pass them in as arguments, but be careful with that practice because it can cause tracing headaches.
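The following sketch puts this together without needing minpack.lm. It reuses getPrediction from the question, simplifies the residual to a plain difference for brevity, and uses made-up inputs; the point is only the argument matching.

```r
getPrediction <- function(parameters, x)
  parameters$A * exp(-parameters$alpha * x) + parameters$B * exp(-parameters$beta * x)

# Reordered residual function. The inner call names its arguments,
# so it no longer depends on the order of getPrediction's parameters.
residFun <- function(xx, parameters, observed)
  observed - getPrediction(parameters = parameters, x = xx)

pp  <- list(A = 2, B = 0.8, alpha = 0.6, beta = 0.01)
x   <- seq(0, 10, length.out = 5)
obs <- getPrediction(pp, x)   # noise-free "observations"

# Named call: works, and the residuals are all zero
res_named <- residFun(xx = x, parameters = pp, observed = obs)

# Positional call in the wrong order: x is matched to 'parameters',
# which reproduces the error from the question
err <- tryCatch(getPrediction(x, pp), error = function(e) conditionMessage(e))
err
```

Naming the arguments in the inner call is what makes the reordered residFun safe, whichever order its own parameters are declared in.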