I've read many similar questions but still couldn't find the answer.
Here is some data that I'm using to calibrate the equation below:
set.seed(100)
i <- sort(rexp(n = 100,rate = 0.01))
Tr <- sort(runif(n = 100,min = 5,max = 100))
k_start <- 3259
u_start <- 0.464
t0_start <- 38
n_start <- -1
i_test <- k_start*Tr^u_start * (5 + t0_start)^n_start
m <- nls(i~(k * Tr^u / (5+t0)^n), start = list(k = k_start, u = u_start,
t0 = t0_start, n = n_start))
When I used nlsLM instead, the same error came up:
Error in nlsModel(formula, mf, start, wts) : singular gradient matrix at initial parameter estimates
For the start values, I tried to use the values from calibration in Python and still the same error occurs.
There's also another way to write that equation, shown below; however, the result is the same error.
d_start <- 43
m <- nls(i ~ (k * Tr^u / d),
start = list(k = k_start, u = u_start,d=d_start))
When I use only the numerator it works, but that's not what I need.
Any help will be very much appreciated.
In the first nls, the right hand side depends on k, t0 and n only through
k / (5+t0)^n so it is over parameterized as one parameter could represent
their combined effect. In the second nls the right hand side depends only
on k and d through k / d so again the problem has been over parameterized and
one parameter could represent their combined effect.
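The over-parameterization can also be checked numerically. The sketch below is only an illustration: it uses an evenly spaced Tr grid (an assumption, not the simulated data from the question) and the question's starting values, builds the gradient matrix from the analytic partial derivatives, and asks for its rank. The columns for k, t0 and n are all constant multiples of the fitted values, so only two of the four columns should be linearly independent.

```r
# Illustrative Tr grid (an assumption, not the asker's simulated data)
Tr <- seq(5, 100, length.out = 100)

# Starting values taken from the question
k <- 3259; u <- 0.464; t0 <- 38; n <- -1

# Right-hand side of the first model
f <- k * Tr^u / (5 + t0)^n

# Analytic partial derivatives of f with respect to each parameter
J <- cbind(
  k  = f / k,
  u  = f * log(Tr),
  t0 = -n * f / (5 + t0),
  n  = -f * log(5 + t0)
)

# Number of linearly independent columns; fewer than 4 means a singular gradient
grad_rank <- qr(J)$rank
grad_rank
```

A rank below the number of parameters is what nls reports as a "singular gradient matrix".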
After getting rid of the excess parameters and obtaining the starting values from a linear model, the fit converges.
fit.lm <- lm(log(i) ~ log(Tr))
co <- coef(fit.lm)
fit <- nls(i ~ k * Tr ^ u, start = list(k = exp(co[[1]]), u = co[[2]]))
fit
## Nonlinear regression model
## model: i ~ k * Tr^u
## data: parent.frame()
## k u
## 0.0002139 3.0941602
## residual sum-of-squares: 79402
##
## Number of iterations to convergence: 43
## Achieved convergence tolerance: 5.354e-06
Reciprocal Model
Below we fit a "reciprocal model" which has the same number of parameters but a better fit as measured by the deviance which is the residual sum of squares. A lower value means better fit.
# reciprocal model
fit.recip <- nls(i ~ 1/(a + b * log(Tr)), start = list(a = 1, b = 1))
deviance(fit)
## [1] 79402.17
deviance(fit.recip)
## [1] 25488.1
Graphics
Below we plot both fit (red) and fit.recip (blue) models.
plot(i ~ Tr)
lines(fitted(fit) ~ Tr, col = "red")
lines(fitted(fit.recip) ~ Tr, col = "blue")
legend("topleft", legend = c("fit", "fit.recip"), lty = 1, col = c("red", "blue"))
plinear
Note that the plinear algorithm could be used as an alternative to fit the fit model above without having to supply a starting value for k. It also has the additional benefit that it requires substantially fewer iterations in this case (14 vs. 43). With plinear the formula should omit the linear parameter, k, as it is implied by the algorithm and will be reported as .lin .
nls(i ~ Tr ^ u, start = list(u = co[[2]]), algorithm = "plinear")
## Nonlinear regression model
## model: i ~ Tr^u
## data: parent.frame()
## u .lin
## 3.0941725 0.0002139
## residual sum-of-squares: 79402
##
## Number of iterations to convergence: 14
## Achieved convergence tolerance: 3.848e-06
I have a data-set and I'm exploring potential exponential fits to the data using eye-balled estimations and estimations via maximum likelihood methods. I'm finding a huge discrepancy in one of the parameter fits and I'm not sure how to reconcile between what I expect vs the stats. I'm wondering if the dnorm in the ML function is appropriate or perhaps my plotting from the estimates isn't correct?
Data fit to an exponential function of the type a * exp(-x * b) + c
y.size <- c(2.69,4.1,8.04,3.1,5.27,5.033333333,3.2,7.25,6.29,4.55,6.1,2.65,3.145,3.775,3.46,5.73,5.31,4.425,3.725,4.32,5,3.09,5.25,5.65,3.48,6.1,10,9.666666667,6.06,5.9,2.665,4.32,3.816666667,3.69,5.8,5,3.72,3.045,4.485,3.642857143,5.5,6.333333333,4.75,6,7.466666667,5.03,5.23,4.85,5.59,5.96,5.33,4.92,4.255555556,6.346666667,4.13,6.33,4,7.35,6.35,4.63,5.13,7.4,4.28,4.233333333,4.3125,6.18,4.3,4.47,4.88,4.5,2.96,2.1,3.7,3.62,5.42,3.8,5.5,3.27,3.36,3.266666667,2.265,3.1,2.51,2.51,4.4,2.64,4.38,4.53,2.29,2.87,3.395,3.26,2.77,3.22,4.31,4.73,4.05,3.48,4.8,4.7,3.05,4.21,5.95,4.39,4.55,4.27,4.955,4.65,3.32,3.48,3.828571429,4.69,4.68,3.76,3.91,4,4.41,4.19,4.733333333,4.32,2.83,3.41,4.42,3.47,3.84,4.39)
x.number <- c(69,62,8,80,13,12,2,22,19,49,840,44,31,56,33,58,91,8,15,86,11,69,12,24,32,27,1,4,26,4,28,33,1516,41,20,58,44,29,58,14,3,3,6,3,26,52,26,29,92,30,18,11,27,19,38,78,57,52,17,45,56,7,37,7,14,13,164,76,82,14,273,122,662,434,126,374,1017,522,374,602,164,5,191,243,134,70,23,130,306,516,414,236,172,164,92,53,50,17,22,27,92,48,30,55,28,296,35,12,350,17,22,53,97,62,92,272,242,170,37,220,452,270,392,314,150,232)
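library(ggplot2)
df <- data.frame(x.number, y.size)
df <- df[df$x.number < 750,] # assumes totalinf in the original code refers to x.number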
a <- 9
b <- 0.07
c <- 3
ggplot(data=df,aes(x=x.number,y=y.size))+ geom_point()+ stat_function(fun=function(x)a*exp(-x*b) + c, color = "blue")
I then fit the function by maximum likelihood, initializing the optimizer with these estimates:
a = 9; b = 0.07; c = 3; sigma = 1
expreg <- function(a,b,c, sigma){
y.pred <- a * exp(-x.number*b) + c
ll <- -sum(dnorm(y.size, mean = y.pred, sd = sigma, log=TRUE ))
ll
}
library(stats4) # provides mle()
mle2.expreg.model <- mle(expreg, start = list(a = 9, b = 0.07, c = 3 , sigma = 1))
summary(mle2.expreg.model)
Coefficients:
Estimate Std. Error
a 8.667305 974.16243176
b 12.671940 NaN
c 4.488451 0.12466034
sigma 1.382550 0.08814796
The two b estimates differ by more than two orders of magnitude (0.07 by eye vs. 12.67 from ML). Needless to say, the curve looks quite flat with the new estimates.
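One possible reading of these estimates, offered as a sketch rather than a diagnosis: with b near 12.67 the exponential term is essentially zero for every x of 1 or more, so the fitted curve reduces to the constant c and the likelihood barely changes with a or b. That would be consistent with the huge standard error for a and the NaN for b. The x values below are illustrative; the coefficients are the ones reported in the summary.

```r
# Reported ML estimates from the summary above
a_hat <- 8.667305
b_hat <- 12.671940
c_hat <- 4.488451

# Illustrative x values (the smallest x.number in the data is 1)
x <- c(1, 2, 3, 10, 100)

# Contribution of the exponential term at the ML estimates
decay <- a_hat * exp(-x * b_hat)
decay

# Fitted values are then indistinguishable from the constant c_hat
fitted_ml <- decay + c_hat
fitted_ml
```

If this is what happened, comparing the negative log-likelihood at the eye-balled values with the one at the ML estimates would show whether the optimizer simply stepped into a flat region.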
I am looking for an approach to compare linear models, non-linear models, and non-linear models with different parameterizations. For this:
#Packages
library(nls2)
library(minpack.lm)
# Data set - Diameter in function of Feature and Age
Feature<-sort(rep(c("A","B"),22))
Age<-c(60,72,88,96,27,
36,48,60,72,88,96,27,36,48,60,72,
88,96,27,36,48,60,27,27,36,48,60,
72,88,96,27,36,48,60,72,88,96,27,
36,48,60,72,88,96)
Diameter<-c(13.9,16.2,
19.1,19.3,4.7,6.7,9.6,11.2,13.1,15.3,
15.4,5.4,7,9.9,11.7,13.4,16.1,16.2,
5.9,8.3,12.3,14.5,2.3,5.2,6.2,8.6,9.3,
11.3,15.1,15.5,5,7,7.9,8.4,10.5,14,14,
4.1,4.9,6,6.7,7.7,8,8.2)
d<-dados <- data.frame(Feature,Age,Diameter)
str(d)
I will create three different models: two non-linear models with specific parameterizations and one linear model. In my example
I suppose that all the coefficients of each model are significant (without considering the real results).
# Model 1 non-linear
e1<- Diameter ~ a1 * Age^a2
#Algorithm: Levenberg-Marquardt
m1 <- nlsLM(e1, data = d,
start = list(a1 = 0.1, a2 = 10),
control = nls.control(maxiter = 1000))
# Model 2 linear
m2<-lm(Diameter ~ Age, data=d)
# Model 3 another non-linear
e2<- Diameter ~ a1^(-Age/a2)
m3 <- nls2(e2, data = d, alg = "brute-force",
start = data.frame(a1 = c(-1, 1), a2 = c(-1, 1)),
control = nls.control(maxiter = 1000))
Now, my idea is to find the "better" model despite the different nature of each model, so I tried a proportional measure:
I compare each model's residual sum of squares with the total sum of squares in the data set. Doing this for
models 1 and 2 gives:
## MSE approach (like pseudo R2 approach)
#Model 1
SQEm1<-summary(m1)$sigma^2*summary(m1)$df[2]# residual sum of squares of the model
SQTm1<-var(d$Diameter)*(length(d$Diameter)-1)# total sum of squares in the data set
R1<-1-SQEm1/SQTm1
R1
#Model 2
SQEm2<-summary(m2)$sigma^2*summary(m2)$df[2]# residual sum of squares of the model
R2<-1-SQEm2/SQTm1
R2
In my humble opinion, model 1 is "better" than model 2. My question is: does this approach sound correct? Is there any other way to compare these types of models?
Thanks in advance!
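The proportional measure described above can be written more compactly with deviance(), which returns the residual sum of squares for both lm and nls fits. The following is a minimal sketch on simulated data (the data, the helper name pseudo_r2 and the object names are assumptions made for illustration, not part of the question). Since both models share the same untransformed response, AIC() is shown as a second, hedged point of comparison.

```r
# Simulated data standing in for the Diameter/Age data (an assumption)
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 2 * x^1.5 + rnorm(50, sd = 0.5)
sim <- data.frame(x, y)

# Linear model
fit_lm <- lm(y ~ x, data = sim)

# Power model, with starting values from a log-log linear fit
co <- coef(lm(log(y) ~ log(x), data = sim))
fit_nls <- nls(y ~ a * x^b, data = sim,
               start = list(a = exp(co[[1]]), b = co[[2]]))

# Hypothetical helper: 1 - residual SS / total SS
pseudo_r2 <- function(model, y) {
  1 - deviance(model) / sum((y - mean(y))^2)
}

r2 <- c(lm = pseudo_r2(fit_lm, sim$y), nls = pseudo_r2(fit_nls, sim$y))
r2

# Information criterion on the same response scale
AIC(fit_lm, fit_nls)
```

For a non-linear model this quantity is only a descriptive ratio and does not have all the properties of R-squared in a linear model, so it is probably best read alongside out-of-sample error.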
#First cross-validation approach ------------------------------------------
#Cross-validation model 1
set.seed(123) # for reproducibility
n <- nrow(d)
frac <- 0.8
ix <- sample(n, frac * n) # indexes of in sample rows
e1<- Diameter ~ a1 * Age^a2
#Algorithm: Levenberg-Marquardt
m1 <- nlsLM(e1, data = d,
start = list(a1 = 0.1, a2 = 10),
control = nls.control(maxiter = 1000), subset = ix)# in sample model
BOD.out <- d[-ix, ] # out of sample data
pred <- predict(m1, new = BOD.out)
act <- BOD.out$Diameter
RSS1 <- sum( (pred - act)^2 )
RSS1
#[1] 56435894734
#Cross-validation model 2
m2<-lm(Diameter ~ Age, data=d, subset = ix)# in sample model
BOD.out2 <- d[-ix, ] # out of sample data
pred <- predict(m2, new = BOD.out2)
act <- BOD.out2$Diameter
RSS2 <- sum( (pred - act)^2 )
RSS2
#[1] 19.11031
# Sum of squares approach -----------------------------------------------
deviance(m1)
#[1] 238314429037
deviance(m2)
#[1] 257.8223
Based on the comments by gfgm and G. Grothendieck: RSS2 is lower than RSS1, and deviance(m2) is lower than deviance(m1), so model 2 is better than model 1.
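A single 80/20 split depends on which rows happen to be held out. A k-fold variant of the same idea averages over several splits. The sketch below uses simulated data and hypothetical object names (sim, fold, sse are assumptions for illustration); each fold is held out once, both models are refitted on the remaining rows, and the squared prediction errors are accumulated.

```r
# Simulated data standing in for the Diameter/Age data (an assumption)
set.seed(42)
x <- seq(1, 10, length.out = 60)
y <- 2 * x^1.5 + rnorm(60, sd = 0.5)
sim <- data.frame(x, y)

k <- 5
# Randomly assign each row to one of k folds of equal size
fold <- sample(rep(1:k, length.out = nrow(sim)))

sse <- c(lm = 0, nls = 0)
for (f in 1:k) {
  train   <- sim[fold != f, ]
  holdout <- sim[fold == f, ]

  m_lm <- lm(y ~ x, data = train)

  # Starting values for the power model from a log-log fit on the training rows
  co <- coef(lm(log(y) ~ log(x), data = train))
  m_nls <- nls(y ~ a * x^b, data = train,
               start = list(a = exp(co[[1]]), b = co[[2]]))

  # Accumulate out-of-sample squared errors
  sse["lm"]  <- sse["lm"]  + sum((holdout$y - predict(m_lm,  newdata = holdout))^2)
  sse["nls"] <- sse["nls"] + sum((holdout$y - predict(m_nls, newdata = holdout))^2)
}

# Cross-validated mean squared prediction error per model
cv_mse <- sse / nrow(sim)
cv_mse
```

The model with the lower cross-validated error is preferred under this criterion, whatever its functional form.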
I want to estimate a binomial model with the R package MCMCglmm. The model shall incorporate an intercept and a slope - both as fixed and random parts. How do I have to specify an accepted prior? (Note, here is a similar question, but in a much more complicated setting.)
Assume the data have the following form:
y x cluster
1 0 -0.56047565 1
2 1 -0.23017749 1
3 0 1.55870831 1
4 1 0.07050839 1
5 0 0.12928774 1
6 1 1.71506499 1
In fact, the data have been generated by
set.seed(123)
nj <- 15 # number of individuals per cluster
J <- 30 # number of clusters
n <- nj * J
x <- rnorm(n)
y <- rbinom(n, 1, prob = 0.6)
cluster <- factor(rep(1:J, each = nj)) # J clusters with nj individuals each
dat <- data.frame(y = y, x = x, cluster = cluster)
The information in the question about the model suggests specifying fixed = y ~ 1 + x and random = ~ us(1 + x):cluster. With us() you allow the random effects to be correlated (cf. section 3.4 and table 2 in Hadfield's 2010 jstatsoft article).
First of all, as you only have one dependent variable (y), the G part of the prior (cf. equation 4 and section 3.6 in Hadfield's 2010 jstatsoft article) for the random-effects variance(s) only needs one list element, called G1. This list element isn't the actual prior distribution, which Hadfield specified to be an inverse-Wishart distribution. Instead, with G1 you specify the parameters of this inverse-Wishart distribution: the scale matrix (Ψ in Wikipedia notation and V in MCMCglmm notation) and the degrees of freedom (ν in Wikipedia notation and nu in MCMCglmm notation). As you have two random effects (the intercept and the slope), V has to be a 2 x 2 matrix. A frequent choice is the two-dimensional identity matrix diag(2). Hadfield often uses nu = 0.002 for the degrees of freedom (cf. his course notes).
Now, you also have to specify the R part of the prior for the residual variance. Here again Hadfield specified an inverse-Wishart distribution, leaving the user to specify its parameters. As we only have one residual variance, V has to be a scalar (let's say V = 0.5). An optional element for R is fix. With this element you specify whether the residual variance shall be fixed to a certain value (then you have to write fix = TRUE or fix = 1) or not (then fix = FALSE or fix = 0). Notice that you don't fix the residual variance to be 0.5 by writing fix = 0.5! So when you find fix = 1 in Hadfield's course notes, read it as fix = TRUE and look at the value of V to see what it was fixed to.
Altogether, we set up the prior as follows:
prior0 <- list(G = list(G1 = list(V = diag(2), nu = 0.002)),
R = list(V = 0.5, nu = 0.002, fix = FALSE))
With this prior we can run MCMCglmm:
library("MCMCglmm") # for MCMCglmm()
set.seed(123)
mod0 <- MCMCglmm(fixed = y ~ 1 + x,
random = ~ us(1 + x):cluster,
data = dat,
family = "categorical",
prior = prior0)
The draws from the Gibbs-sampler for the fixed effects are found in mod0$Sol, the draws for the variance parameters in mod0$VCV.
Normally a binomial model requires the residual variance to be fixed, so we set the residual variance to be fixed at 0.5
set.seed(123)
prior1 <- list(G = list(G1 = list(V = diag(2), nu = 0.002)),
R = list(V = 0.5, nu = 0.002, fix = TRUE))
mod1 <- MCMCglmm(fixed = y ~ 1 + x,
random = ~ us(1 + x):cluster,
data = dat,
family = "categorical",
prior = prior1)
The difference can be seen by comparing mod0$VCV[, 5] to mod1$VCV[, 5]. In the latter case, all entries are 0.5 as specified.
I am a new user of R and hope you will bear with me if my question is silly. I want to estimate the following model using the maximum likelihood estimator in R.
y = a + b*(ln x - α)
where a, b, and α are the parameters to be estimated and x and y are my data. I tried to use the following code that I got from the web:
library(foreign)
maindata <- read.csv("C:/Users/NUNU/Desktop/maindata/output2.csv")
h <- subset(maindata, cropid==10)
library(likelihood)
modelfun <- function (a, b, x) { b *(x-a)}
par <- list(a = 0, b = 0)
var<-list(x = "x")
par_lo <- list(a = 0, b = 0)
par_hi <- list(a = 50, b = 50)
var$y <- "y"
var$mean <- "predicted"
var$sd <- 0.815585
var$log <- TRUE
results <- anneal(model = modelfun, par = par, var = var,
source_data = h, par_lo = par_lo, par_hi = par_hi,
pdf = dnorm, dep_var = "y", max_iter = 20000)
The result I am getting is similar although the data is different, i.e., even when I change the cropid. Similarly, the predicted value generated is for x rather than y.
I do not know what I missed or went wrong. Your help is highly appreciated.
I am not sure if your model formula will lead to a unique solution, but in general you can find the MLE with the optim function.
Here is a simple example for linear regression with optim:
fn <- function(beta, x, y) {
a = beta[1]
b = beta[2]
sum( (y - (a + b * log(x)))^2 )
}
# generate some data for testing
x = 1:100
# a = 10, b = 3.5
y = 10 + 3.5 * log(x)
optim(c(0,0),fn,x=x,y=y,method="BFGS") # one starting value per parameter (a, b)
You can change the function "fn" to reflect your model formula, e.g.
sum( (y - (YOUR MODEL FORMULA) )^2 )
EDIT
I am just giving a simple example of using optim in case you have a custom model formula to optimize. I did not mean to suggest using it for simple linear regression, since lm will be sufficient.
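On the doubt about a unique solution raised at the start of this answer: expanding the model gives y = (a - b*α) + b*ln x, so a and α enter only through the combination a - b*α. The short check below is a sketch with made-up parameter values and a hypothetical helper named pred; it shows two different (a, α) pairs producing identical predictions, which means the data cannot separate them.

```r
# Hypothetical helper implementing the model from the question
pred <- function(a, b, alpha, x) a + b * (log(x) - alpha)

# Illustrative x values
x <- c(1, 2, 5, 10, 50)

# Two parameter sets with the same b and the same a - b * alpha (= 10)
p1 <- pred(a = 10, b = 3.5, alpha = 0, x = x)
p2 <- pred(a = 17, b = 3.5, alpha = 2, x = x)

# The predictions coincide, so a and alpha are not separately identified
same_predictions <- isTRUE(all.equal(p1, p2))
same_predictions
```

In practice one would either fix α (or a) at a known value or estimate only the intercept a - b*α and the slope b.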
I was a bit surprised that iTech used optim for what is a problem that is linear in its parameters. With his data for x and y:
> lm(y ~ log(x) )
Call:
lm(formula = y ~ log(x))
Coefficients:
(Intercept) log(x)
10.0 3.5
For linear problems, the least squares solution is the ML solution.
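This equivalence can be checked numerically. The sketch below simulates noisy data (the noise level, the seed and the names negll, fit_ml, fit_ls are assumptions for illustration), maximizes a Gaussian log-likelihood with optim, and compares the resulting intercept and slope with those from lm. The standard deviation is optimized on the log scale so that it stays positive.

```r
# Simulated data: same mean structure as the example above, plus noise (assumption)
set.seed(1)
x <- 1:100
y <- 10 + 3.5 * log(x) + rnorm(100, sd = 0.5)

# Negative Gaussian log-likelihood; par = (intercept, slope, log of sd)
negll <- function(par) {
  mu <- par[1] + par[2] * log(x)
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}

# Maximum likelihood via optim, started from rough data-based values
fit_ml <- optim(c(mean(y), 0, log(sd(y))), negll, method = "BFGS")

# Ordinary least squares
fit_ls <- lm(y ~ log(x))

# Intercept and slope from both approaches, side by side
rbind(ml = fit_ml$par[1:2], ls = unname(coef(fit_ls)))
```

The two rows should agree up to the optimizer's tolerance; only the estimate of the error standard deviation differs slightly, because ML divides by n rather than by the residual degrees of freedom.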