I'm fitting some exponential data using nls.
The code I'm using is:
fit <- nls(y ~ expFit(times, A, tau, C), start = c(A=100, tau=-3, C=0))
expFit is defined as
expFit <- function(t, A, tau, C)
{
expFit <- A*(exp(-t/tau))+C
}
This works well for most of my data, for which the starting parameters provided (100, -3 and 0) work well. Sometimes, though, I have data that doesn't go well with those parameters and I get errors from nls (e.g. "singular gradient" or things like that). How do I "catch" these errors?
I tried to do something like
fit <- NULL
fit <- nls(...)
if (is.null(fit))
{
// Try nls with other starting parameters
}
But this won't work because nls seems to stop the execution and the code after nls will not execute...
Any ideas?
Thanks
nico
I usually use this trick:
params<-... # setup default params.
while(TRUE){
fit<-NULL
try(fit<-nls(...)); # does not stop in the case of error
if(!is.null(fit))break; # if nls works, then quit from the loop
params<-... # change the params for nls
}
Related
I'm trying to write a set of functions where the first function fits a cox model (via coxph in the survival package in R), and the second function gets estimated survival for a new dataset, given the fitted model object from the first function. I'm running into some sort of scoping issue that I don't quite know how to solve without substantially re-factoring my code (the only way I could think to do it would be much less general and much harder to read).
I have a very similar set of functions that are based on the glm function that do not run into the same issue and give me the answers I would expect. I've included a short worked example below that demonstrates the issue. The glue.cox and glue.glm are functions that have the basic functionality I am trying to get. glue.glm works as expected (yielding the same values from a calculation in the global environment), but the glue.cox complains that it can't find the data that was used to fit the cox model and ends with an error. I don't understand how to do this with substitute but I suspect that is the way forward. I've hit a wall with experimenting.
library(survival)
data.global = data.frame(time=runif(20), x=runif(20))
newdata.global = data.frame(x=c(0,1))
f1 = Surv(time) ~ x # this is the part that messes it up!!!!! Surv gets eval
f2 = time ~ x # this is the part that messes it up!!!!! Surv gets eval
myfit.cox.global = coxph(f1, data=data.global)
myfit.glm.global = glm(f2, data=data.global)
myfit.glm.global2 = glm(time ~ x, data=data.global)
myfit.cox <- function(f, dat.local){
coxph(f, data=dat.local)
}
myfit.glm <- function(f, dat.local){
glm(f, data=dat.local)
}
mypredict.cox <- function(ft, dat.local){
newdata = data.frame(x=c(0,1))
tail(survfit(ft, newdata)$surv, 1)
}
mypredict.glm <- function(ft, dat.local){
newdata = data.frame(x=c(0,1))
predict(ft, newdata)
}
glue.cox <- function(f, dat.local){
fit = myfit.cox(f, dat.local)
mypredict.cox(fit, dat.local)
}
glue.glm <- function(f, dat.local){
fit = myfit.glm(f, dat.local)
mypredict.glm(fit, dat.local)
}
# these numbers are the goal for non-survival data
predict(myfit.glm.global, newdata = newdata.global)
0.5950440 0.4542248
glue.glm(f2, data.global)
0.5950440 0.4542248 # this works
# these numbers are the goal for survival data
tail(survfit(myfit.cox.global, newdata = newdata.global)$surv, 1)
[20,] 0.02300798 0.03106081
glue.cox(f1, data.global)
Error in eval(predvars, data, env) : object 'dat.local' not found
This appears to work, at least in the narrow sense of making glue.cox() work as desired:
myfit.cox <- function(f, dat.local){
environment(f) <- list2env(list(dat.local=dat.local))
coxph(f, data=dat.local)
}
The trick here is that most R modeling/model-processing functions look for data in the environment associated with the formula.
I don't know why glue.glm works without doing more digging, except for the general statement that [g]lm objects store more of the information needed for downstream processing internally (e.g. in the $qr element) than other model types.
rsq <- function(formula, Data1, indices) {
d <- Data1[indices,] # allows boot to select sample
fit <- lm(formula, Data1=d)
return(summary(fit)$r.square)
}
results = boot(data = Data1, statistic = rsq, R = 500)
When I execute the code, I get the following error:
Error in Data1[indices,] : incorrect number of dimensions
Background info: I am creating a predictive model using Linear Regressions. I would like to test my Predictive Model and through some research, I decided to use the Bootstrapping Method.
Credit goes to #Rui Barradas, check comments for original post.
If you read the help page for function boot::boot you will see that the function it calls has first argument data, then indices, then others. So change the order of your function definition to rsq <- function(Data1, indices, formula)
Another problem that I had was that I didn't define the Function.
I am trying to find the optimal "lambda" parameter for the Box-Cox transformation.
I am using the implementation from the MASS package, so I only need to create the model and extract the lambda.
Here is the code for the function:
library(MASS)
find_lambda <- function(x) {
# Function to find the best lambda for the Box-Cox transform
my_tmp <- data.frame(x = x) # Create a temporary data frame, to use it with the lm
str(my_tmp) # Gives the expected output
the_lm <- lm(x ~ 1, data = my_tmp) # Creates the linear model, no error here
print(summary(the_lm)) # Prints the summary, as expected
out <- boxcox(the_lm, plotit=FALSE) # Gives the error
best_lambda <- out$x[which.max(out$y)] # Extracting the best fitting lambda
return(best_lambda)
}
find_lambda(runif(100))
It gives the following error:
Error in is.data.frame(data) : object 'my_tmp' not found
The interesting thing is that the very same code is working outside the function. In other words, for some reason, the boxcox function from the MASS package is looking for the variable in the global environment.
I don't really understand, what exactly is going on... Do you have any ideas?
P.S. I do not provide a software/hardware specification, since this error was sucessfully replicated on a number of my friends' laptops.
P.P.S. I have found the way to solve the initial problem in the forecast package, but I still would like to know, why this code is not working.
Sometimes user contributed packages don't always do a great job tracking the environments where calls were executed when manipulating functions calls. The quickest fix for you would be to change the line from
the_lm <- lm(x ~ 1, data = my_tmp)
to
the_lm <- lm(x ~ 1, data = my_tmp, y=True, qr=True)
Because if the y and qr are not requested from the lm call, the boxcox function tries to re-run lm with those parameters via an update call and things get mucked up inside a function scope.
Why don't let box-cox do the fitting?
find_lambda <- function(x) {
# Function to find the best lambda for the Box-Cox transform
my_tmp <- data.frame(x = x) # Create a temporary data frame, to use it with the lm
out <- boxcox(x ~ 1, data = my_tmp, plotit=FALSE) # Gives the error
best_lambda <- out$x[which.max(out$y)] # Extracting the best fitting lambda
return(best_lambda)
}
I think your scoping issue is with update.default which calls eval(call, parent.frame()) and my_tmp doesn't exist in the boxcox environment. Please correct me if I'm wrong on this.
boxcox cannot find your data. This maybe because of some scoping issue.
You can feed data in to boxcox function.
find_lambda <- function(x) {
# Function to find the best lambda for the Box-Cox transform
my_tmp <- data.frame(x = x) # Create a temporary data frame, to use it with the lm
str(my_tmp) # Gives the expected output
the_lm <- lm(x ~ 1, data = my_tmp) # Creates the linear model, no error here
print(summary(the_lm)) # Prints the summary, as expected
out <- boxcox(the_lm, plotit=FALSE, data = my_tmp) # feed data in here
best_lambda <- out$x[which.max(out$y)] # Extracting the best fitting lambda
return(best_lambda)
}
find_lambda(runif(100))
I realized a strange behavior today with in my R code.
I tried a package {boot.StepAIC} which includes a bootstrap function for the results of the stepwise regression with the AIC. However I do not think the statistical background is here the problem (I hope so).
I can use the function at the top level of R. This is my example code.
require(MASS)
require(boot.StepAIC)
n<-100
x<-rnorm(n); y<-rnorm(n,sd=2); z<-rnorm(n,sd=3); res<-x+y+z+rnorm(n,sd=0.1)
dat.test<-as.data.frame(cbind(x,y,z,res))
form.1<-as.formula(res~x+y+z)
boot.stepAIC(lm(form.1, dat.test),dat.test) # should be OK - works at me
However, I wanted to wrap that in an own function. I pass the data and the formula to that function. But I get an error within boot.stepAIC() saying:
the model fit failed in 100 bootstrap samples Error in
strsplit(nam.vars, ":") : non-character argument
# custom function
fun.boot.lm.stepAIC<-function(dat,form) {
if(!inherits(form, "formula")) stop("No formula given")
fit.lm<-lm(formula=form,data=dat)
return(boot.stepAIC(object=fit.lm,data=dat))
}
fun.boot.lm.stepAIC(dat=dat.test,form=form.1)
# results in an error
So where is the mistake? I suppose it must have something to do with the local and global environment, doesn't it?
Using do.call as in anova test fails on lme fits created with pasted formula provides the answer.
boot.stepAIC doesn't have access to form when run within a function; that can be recreated in the global environment like this; we see that lm is using form.1 as the formula, and removing it makes boot.stepAIC fail.
> form.1<-as.formula(res~x+y+z)
> mm <- lm(form.1, dat.test)
> mm$call
lm(formula = form.1, data = dat.test)
> rm(form.1)
> boot.stepAIC(mm,dat.test)
# same error as OP
Using do.call does work. Here I use as.name as well; otherwise the mm object carries around the entire dataset instead of just the name of it.
> form.1<-as.formula(res~x+y+z)
> mm <- do.call("lm", list(form.1, data=as.name("dat.test")))
> mm$call
lm(formula = res ~ x + y + z, data = dat.test)
> rm(form.1)
> boot.stepAIC(mm,dat.test)
To apply this to the original problem, I'd do this:
fun.boot.lm.stepAIC<-function(dat,form) {
if(!inherits(form, "formula")) stop("No formula given")
mm <- do.call("lm", list(form, data=as.name(dat)))
do.call("boot.stepAIC", list(mm,data=as.name(dat)))
}
form.1<-as.formula(res~x+y+z)
fun.boot.lm.stepAIC(dat="dat.test",form=form1)
This works too but the entire data set gets included in the final output object, and the final output to console, as well.
fun.boot.lm.stepAIC<-function(dat,form) {
if(!inherits(form, "formula")) stop("No formula given")
mm <- do.call("lm", list(form, data=dat))
boot.stepAIC(mm,data=dat)
}
form.1<-as.formula(res~x+y+z)
fun.boot.lm.stepAIC(dat=dat.test,form=form.1)
In my r script, I do perform an nls to get a fit value:
fit <- nls(...)
and then after that, I test if the nls succeeded by doing this:
if(is.na(fit)) {
print("succeeded")
}
but I get warnings:
the condition has length > 1 and only the first element will be used
am I doing this wrong? if so, what should I do? if not, how do I remove the warning? thanks!
nls induces an error if the fitting failed. So, is.null after try(nls(...)) is the correct way.
here is a piece of code I used when using nls fit for uncertain data:
fit <- NULL
while (TRUE) {
start <- list(...) # try somewhat randomized initial parameter
try(fit <- nls(..., start = start)) # performe nls
if (!is.null(fit)) break;
}