I was playing around with the nlsLM function from the minpack.lm library and encountered some behaviour that I don't understand.
The following function produces output when I supply a numeric vector 'b' as input, so I wanted to use it to fit a nonlinear model to my data.
volEquation <- function(DBH, PHt, b){
b[1] * DBH^b[2] * PHt^b[3]
}
However, I have become stuck when it comes to correctly specifying the initial parameter values. The R code follows:
library(minpack.lm)
n <- 20
x <- seq(12, 60, length.out = n)
y <- seq(22, 45, length.out = n)
z <- x^2 * y ^ 3 + rnorm(n, 0, 0.1)
Data <- data.frame(DBH = x, PHt = y, TVT = z)
nlsFormula <- "TVT ~ volEquation(DBH, PHt, b)"
nlsInitial <- list(b = c(0.5, 2.25, 3.25))
nlsLMOutput <- nlsLM(formula = nlsFormula, data = Data, start = nlsInitial)
nlsOutput <- nls(formula = nlsFormula, data = Data, start = nlsInitial)
nls fitted the data successfully, while nlsLM gave me this error message:
Error in `rownames<-`(`*tmp*`, value = "b") :
  length of 'dimnames' [1] not equal to array extent
Can anyone provide insight as to why this problem occurs in the nlsLM function? I've tried sifting through the nlsLM code but I still don't understand what's going on.
Try passing each parameter as a separate argument instead of a single vector b:
volEquation <- function(DBH, PHt, x,y,z){
x * DBH^y * PHt^z
}
nlsFormula <- "TVT ~ volEquation(DBH, PHt, x, y, z)"
nlsInitial <- c(x=5e-3, y=2, z=1)
nlsOutput <- nlsLM(formula = nlsFormula, data = Data, start = nlsInitial, control=nls.lm.control(maxiter=100))
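Since the question is about choosing initial values, here is a minimal sketch of one common way to get them for a power-law model like this one. This technique is not part of the answer above: it takes logs of both sides and fits an ordinary linear model. The data are rebuilt so the block runs on its own, and set.seed(), logFit and startValues are names added for illustration only.

```r
# Hypothetical illustration: derive starting values from a log-log linear fit.
# The data mirror the question; set.seed() is an addition for reproducibility.
set.seed(1)
n <- 20
Data <- data.frame(DBH = seq(12, 60, length.out = n),
                   PHt = seq(22, 45, length.out = n))
Data$TVT <- Data$DBH^2 * Data$PHt^3 + rnorm(n, 0, 0.1)

# log(TVT) = log(x) + y * log(DBH) + z * log(PHt) is linear in log(x), y and z
logFit <- lm(log(TVT) ~ log(DBH) + log(PHt), data = Data)
cf <- unname(coef(logFit))

# Back-transform the intercept to obtain the multiplicative constant
startValues <- c(x = exp(cf[1]), y = cf[2], z = cf[3])
print(startValues)
```

The named vector has the same shape as nlsInitial in the answer, so it could be passed as start, assuming all responses are positive so that the logarithm is defined.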
library(deSolve)
require(deSolve)
delta_conc <- function(time, current_state, params) {
with(as.list(c(current_state, params)),{
dX <- Y
dY <- X - X^3 - 0.25*Y + A * sin(t)
return(list(c(dX, dY)))
})
}
params <- c(
A <- 0.2645
)
initial_state <- c(
X <- 0.9,
Y <- 0.4
)
times <- 1:10
model <- ode(initial_state, times, delta_conc, params)
summary(model)
matplot(model, type="l",lty=1, main="Enzyme model", xlab="Time")
I get this error message when I try to run it:
Error in checkFunc(Func2, times, y, rho) :
The number of derivatives returned by func() (21) must equal the length of the initial conditions vector (2)
When I exclude the 'sin(t)' part it works, so the problem is with that part, but I'm very much a beginner, so I have no idea how to approach this problem.
You should consistently use either t or time for the actual time step. In your case t is not defined as a variable, so t is interpreted as the transpose function.
The following should work:
require(deSolve)
delta_conc <- function(time, current_state, params) {
with(as.list(c(current_state, params)),{
dX <- Y
dY <- X - X^3 - 0.25*Y + A * sin(time)
return(list(c(dX, dY)))
})
}
params <- c(
A = 0.2645
)
initial_state <- c(
X = 0.9,
Y = 0.4
)
times <- 1:10
model <- ode(initial_state, times, delta_conc, params)
summary(model)
matplot.0D(model, type="l",lty=1, main="Enzyme model", xlab="Time")
In addition, the code had some other issues:
Use either require or library, not both.
Use = within c(). It names the elements, whereas <- performs an assignment and leaves the vector unnamed.
Two additional suggestions:
You can use the deSolve built-in plot function matplot.0D.
I would recommend times <- seq(0, 10, length.out = 100) instead of 1:10, so that the plot is smooth. Starting time at 1 (or another value) may be OK, but it is often more convenient to start at zero.
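The difference between = and <- inside c() can be checked in base R without deSolve. The following is a small sketch, with unnamed_state and named_state as illustrative names that are not in the original code:

```r
# With <-, each argument is an assignment expression: the values are kept,
# but the resulting vector has no names (X and Y are created as a side effect).
unnamed_state <- c(X <- 0.9, Y <- 0.4)
print(names(unnamed_state))

# With =, the arguments are named, so the vector carries the names.
named_state <- c(X = 0.9, Y = 0.4)
print(names(named_state))

# as.list() on the named vector gives the elements that with() looks up by name.
str(as.list(named_state))

# t exists in base R as the transpose function, so an undefined variable t
# still resolves to something when the name is looked up.
print(is.function(t))
```

Only the named version gives with(as.list(...)) anything to look up by name, which is why the corrected code uses =.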
I am using allFit() in lme4 to automatically scan through possible optimizers, since the default optimizer usually doesn't converge in this situation. My code works fine when I run it line-by-line, but when I run it wrapped in a simple function, it gives different results.
I've looked at the output of the allFit call and it seems that when it's NOT inside the function, it returns a list of lmerModLmerTest objects as desired.
However, inside the function, it returns a list with the values simpleError, error, and condition. Why is it doing this?
I'm using RStudio, R 3.6, lme4 1.1-21, lmerTest 3.1-0.
UPDATE: The problem is that the update() method used by allFit cannot find the 'tt' data frame when re-fitting the models. I have put breakpoints into the code and it seems that the 'test' data exists in the function environment, however, so I don't understand why it can't find it...
UPDATE 2: It appears that if I change the assignment of the test data to <<-, it works. This is dangerous, though, since it breaks functional programming, and I think it may fail when I try to parallelize. I am testing further... still open to suggestions!
Here is the code that works, not inside the function:
library(lme4)
multi_arm_var_sim <- function(nsub = 20, nclust = 100, narm = 2, iccs = c(.01, .04), betas = c(0,.3)){
sig_b2 <- -1*iccs / (iccs - 1)
n <- nsub * nclust * narm
y <- rep_len(NA, n)
arm <- as.factor(rep(0:(narm-1), each = nsub*nclust))
clustid <- rep(1:(nclust*narm), each = nsub)
clustRElist <- rnorm(narm*nclust, mean = 0, sd = rep(sqrt(sig_b2), each = nclust))
clustRE <- rep(clustRElist, each = nsub)
sig_b2 <- rep(sig_b2, each = nclust*nsub)
error <- rnorm(n, mean = 0, sd = 1)
beta <- rep(betas, each = nclust*nsub)
linpred <- beta + clustRE + error
output <- cbind.data.frame(arm, clustid, sig_b2, clustRE, linpred)
return(output)
}
set.seed(2)
test_1 <- multi_arm_var_sim()
model_flex_1 <- lmer(linpred ~ arm + (1 + arm | clustid),
data = test_1)
diff_optims_1 <- allFit(model_flex_1, verbose = TRUE)
print(class(diff_optims_1[[1]]))
is.OK_1 <- sapply(diff_optims_1, is, "lmerMod")
print(is.OK_1)
And here is the code that doesn't work, same setup, wrapped in a function.
library(lme4)
multi_arm_var_sim <- function(nsub = 20, nclust = 100, narm = 2, iccs = c(.01, .04), betas = c(0,.3)){
sig_b2 <- -1*iccs / (iccs - 1)
n <- nsub * nclust * narm
y <- rep_len(NA, n)
arm <- as.factor(rep(0:(narm-1), each = nsub*nclust))
clustid <- rep(1:(nclust*narm), each = nsub)
clustRElist <- rnorm(narm*nclust, mean = 0, sd = rep(sqrt(sig_b2), each = nclust))
clustRE <- rep(clustRElist, each = nsub)
sig_b2 <- rep(sig_b2, each = nclust*nsub)
error <- rnorm(n, mean = 0, sd = 1)
beta <- rep(betas, each = nclust*nsub)
linpred <- beta + clustRE + error
output <- cbind.data.frame(arm, clustid, sig_b2, clustRE, linpred)
return(output)
}
get_pval <- function(){
tt <- multi_arm_var_sim()
model_flex <- lme4::lmer(linpred ~ arm + (1 + arm | clustid),
data = tt)
diff_optims <- lme4::allFit(model_flex, data = tt, verbose = TRUE)
print(class(diff_optims[[1]]))
is.OK <- sapply(diff_optims, is, "merMod")
print(is.OK)
}
set.seed(2)
get_pval()
Thanks!!
Can anyone help me write a function that returns an x value from a loess smooth? I tried the following, but it seems wrong. What I want is the x value that yields the maximum y of the loess fit. Thanks in advance.
myFmsy<-function(x,y){
model <- loess(y ~ x,span = 0.4)
return(x[which(y==max(y))])
}
The problem is that you are fitting a model and then not using it at all.
The return value of loess is a list (of class "loess") with a member fitted. This is the vector where you want to find the maximum.
myFmsy <- function(x, y){
model <- loess(y ~ x,span = 0.4)
yfit <- model$fitted
x[which(yfit == max(yfit))]
}
set.seed(6589) # Make the results reproducible
x <- rnorm(100)
y <- rnorm(100)
myFmsy(x, y)
#[1] -0.938093
There might be cases where, due to floating-point issues, several values are close to each other, within a given tolerance. The following function checks this and also returns the fitted y and the index ix where it can be found.
myFmsy2 <- function(x, y, tol = .Machine$double.eps^0.5){
model <- loess(y ~ x,span = 0.4)
yfit <- model$fitted
inx <- which(abs(yfit - max(yfit)) < tol)
list(x = x[inx], y.fitted = yfit[inx], ix = inx)
}
myFmsy2(x, y)
#$`x`
#[1] -0.938093
#
#$y.fitted
#[1] 0.5046313
#
#$ix
#[1] 48
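As a variation on the functions above, here is a sketch that uses which.max() and also evaluates the smooth on a finer grid, in case the peak lies between observed x values. The helper name peak_of_loess and the n_grid parameter are assumptions made for this example, not part of the answer.

```r
# Sketch: locate the maximum of a loess fit, at observed x and on a finer grid.
# peak_of_loess is a hypothetical helper; n_grid is an assumed parameter.
peak_of_loess <- function(x, y, span = 0.4, n_grid = 200) {
  model <- loess(y ~ x, span = span)
  yfit <- model$fitted
  i <- which.max(yfit)                      # index of the first maximum
  # Evaluate the smooth on a grid inside the observed range of x
  grid <- seq(min(x), max(x), length.out = n_grid)
  ygrid <- predict(model, newdata = data.frame(x = grid))
  j <- which.max(ygrid)                     # which.max() skips any NA values
  list(x_observed = x[i], y_observed = yfit[i],
       x_grid = grid[j], y_grid = ygrid[j])
}

set.seed(6589)
x <- rnorm(100)
y <- rnorm(100)
peak <- peak_of_loess(x, y)
print(peak)
```

which.max() returns only the first maximum, so it sidesteps the equality comparison but does not report ties the way myFmsy2 does.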
I'm doing Maximum Likelihood Estimation using maxLik, which requires specifying starting values. Instead of specifying a single value, is there any way that allows me to use all the values from a matrix as the start value?
My current code of maxLik is:
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
p <- 1/(1 + exp(-rho*u))
f <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))}
ml <- maxLik(f12, start = c(alpha = 1, rho=2, lambda = 1), method = "NM")
I create a dataframe with the upper and lower bounds of potential start values:
st <- expand.grid(alpha = seq(0, 2, len = 100), rho = seq(0, 1, len = 100), lambda = seq(0, 2, len = 100))
There are 3 parameters in my function, and my goal is to loop all the values in the above dataframe st and select the best vector of start values after running the model from a variety of starting parameters.
Thanks!
Consider Map (a wrapper around mapply) to pass the columns of st elementwise to your function. Here, Map returns a list of maxLik objects (which inherit from class maxim and contain a list of other components), or NA where a fit fails. The list has one item per row of st.
Notice that the input parameters a, r, and l are passed into the start argument of maxLik() in place of hard-coded values. The body of f12 is left untouched.
maxLik_run <- function(a, r, l) {
tryCatch({
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
p <- 1/(1 + exp(-rho*u))
f <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))
}
return(maxLik(f12, start = c(alpha = a, rho = r, lambda = l), method = "NM"))
}, error = function(e) return(NA))
}
st <- expand.grid(alpha = seq(0, 2, len = 100),
rho = seq(0, 1, len = 100),
lambda = seq(0, 2, len = 100))
maxLik_list <- Map(maxLik_run, st$alpha, st$rho, st$lambda)
And to answer the question --best vector of start values after running the model from a variety of starting parameters-- requires a particular definition of "best". Once you define this, you can use Filter() on your returned list of objects to select the one or more element that yields this "best".
Below is a demonstration that finds the fits with the highest maximized log-likelihood (the maximum component). Use estimate if you need the parameter values. Failed runs, which maxLik_run returns as NA, are dropped first. Note that the returned list can have more than one item if the highest value is shared:
successful <- Filter(is.list, maxLik_list)
highest_value <- max(sapply(successful, function(item) item$maximum))
maxLik_item_list <- Filter(function(i) i$maximum == highest_value, successful)
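The same Map, tryCatch and Filter pattern can be tried without the p12 data. The sketch below swaps maxLik for base R's optim and uses a toy objective with a known maximum, so it only illustrates the looping and selection logic. toy_objective, run_from and starts are made-up names.

```r
# Toy objective with a known maximum at (1, 2); it stands in for a log-likelihood
toy_objective <- function(p) -(p[1] - 1)^2 - (p[2] - 2)^2

# One run from one start; failures are returned as NA, mirroring maxLik_run
run_from <- function(a, b) {
  tryCatch(
    optim(c(a, b), toy_objective, method = "Nelder-Mead",
          control = list(fnscale = -1)),   # fnscale = -1 makes optim maximize
    error = function(e) NA
  )
}

# A small grid of starting values, one run per row
starts <- expand.grid(a = c(-1, 0, 3), b = c(0, 5))
fits <- Map(run_from, starts$a, starts$b)

# Keep successful runs, then pick the one with the largest objective value
ok <- Filter(is.list, fits)
values <- sapply(ok, function(fit) fit$value)
best <- ok[[which.max(values)]]
print(best$par)
```

With a real likelihood, the runs would usually end at different optima, and comparing the values across starts is what reveals the best one.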
In your log-likelihood function you are calculating alpha, lambda and rho, whereas your data already has them. Those are the lines with u, p and f12 (which is also your function name!). It is also possible to calculate the log likelihood for one row, as your log-likelihood function has single indices. So you can run the code using apply like this:
#create a function to find mle estimate for first row
maxlike <- function(a) {
f12 <- function(param){
alpha <- param[1]
rho <- param[2]
lambda <- param[3]
#u <- 0.5*(p12$v_50_1)^alpha + 0.5*lambda*(p12$v_50_2)^alpha
#p <- 1/(1 + exp(-rho*u))
#f12 <- sum(p12$gamble*log(p) + (1-p12$gamble)*log(1-p))
}
ml <- maxLik(f12, start = c(alpha = 1, rho=2, lambda = 1), method = "NM")
}
#then use apply with data = st; MARGIN = 1 means rows, followed by your mle function
mle <- apply(st, 1, maxlike)
mle
So, I have these functions:
funk1 <- function(a,x,l,r) {
x^2*exp(-(l*(1-exp(-r*a))/r))}
funk2 <- function(x,l,r) {
sapply(x, function (s) {
integrate(funk1, lower = 0, upper = s, x=s, l=l, r=r)$value })}
which are used to explain the data y in,
z <- data.frame(ts = 1:100,
y = funk2(1:100, l = 1, r = 1) + rpois(100, 1:100))
I wish to use optim to maximise the likelihood, so I defined a likelihood function:
LL_funk <- function(l,r) {
n=nrow(z)
R = sum((funk2(ts,l,r) - y)^2)
logl = -((n/2)*log(R))
return(-logl)
}
and I tried to fit using optim
fit <- optim(par=c(0.5,0.5), fn= LL_funk, method="Nelder-Mead")
But I get an error:
Error in integrate(funk1, lower = 0, upper = s, x = s, l = l, r = r) :
a limit is missing
I am not sure why? I could run nls fitting funk2(x,l,r) to y
nls(y ~ funk2(ts,l,r), data = z, start = list(l = 0.5, r = 0.5))
That means funk2 is working. I guess the problem is with the LL function that I have designed, which I can't figure out! Please help!
Yup! There were two problems with your function. This worked for me:
LL_funk <- function(params) {
n=nrow(z)
l = params[1]
r = params[2]
R = sum((funk2(z$ts,l,r) - z$y)^2)
logl = -((n/2)*log(R))
return(-logl)
}
Previous issues:
optim passes all the parameters to fn as a single vector, so LL_funk must take one argument and unpack l and r from it.
On the right-hand side of the assignment to R, ts and y did not actually refer to columns of your dataset. They need to be z$ts and z$y.
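Both points can be seen in a small self-contained sketch that does not need funk2. The toy data, sse_two_args and sse_vector are assumptions for illustration. The wrapper shows an alternative to rewriting the objective: keep the two-argument function and unpack the vector that optim supplies.

```r
# Toy data on an exact line y = 1 + 0.5 * x (assumed for illustration)
d <- data.frame(x = seq(-2, 2, length.out = 21))
d$y <- 1 + 0.5 * d$x

# A two-argument objective, analogous in shape to the original LL_funk(l, r)
sse_two_args <- function(intercept, slope) {
  sum((intercept + slope * d$x - d$y)^2)   # columns referenced explicitly via d$
}

# optim calls fn(par) with one vector, so unpack it in a thin wrapper
sse_vector <- function(params) sse_two_args(params[1], params[2])

fit <- optim(par = c(0, 0), fn = sse_vector, method = "Nelder-Mead")
print(fit$par)
```

Passing sse_two_args directly to optim would fail, because the whole parameter vector would be bound to intercept and slope would be missing.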