Nonlinear regression in R shows error

I am using the R function nlsLM from the package minpack.lm and I get the following error.
I generate my own signal with noise, so I know all of the parameters, which I then try to recover by regression analysis using the same function I used to generate the signal.
The problem is that nlsLM runs fine, and it even finds the right parameter values, but at the end, once it has found them, an error appears:
It. 23, RSS = 14.4698, Par. = 42.6727 0.78112 1 65.2211 15.6065 1
It. 24, RSS = 14.4698, Par. = 42.671 0.781102 1 65.2212 15.6069 1
Error in stats:::nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
I do not know what to do. Please explain what might be causing this and how I can solve it.
Additional information:
#This is how I generate my signal (it is a convolution of a Gaussian with a stretched exponential)
set.seed(100)
Yexp=sim_str_exp(error=10)
time=Yexp[[1]]
y=Yexp[[2]]
dataset_nls=data.frame(time,y)
start=c(tau1=.5,beta1=.5,exp_A1=.5,gaus_pos=.5,gaus_width=.5,gaus_A=0.5)
lower=c(tau1=0.01,beta1=0.01,exp_A1=0.01,gaus_pos=0.01,gaus_width=0.01,gaus_A=0.01)
upper=c(tau1=100,beta1=1,exp_A1=1,gaus_pos=100,gaus_width=850,gaus_A=1)
#here I do the fitting
FIT=nlsLM(y ~ str_exp_model(time,tau1,beta1,exp_A1,gaus_pos,gaus_width,gaus_A),
          data=dataset_nls, start=start, lower=lower, upper=upper,
          trace=TRUE, algorithm="LM", na.action=na.pass,
          control=nls.lm.control(maxiter=200, nprint=1))
#Model_function
str_exp_model <- function(time, tau1, beta1, exp_A1, gaus_pos, gaus_width, gaus_A){
  F_gen_V  <- vector(length=length(time))
  F_gaus_V <- vector(length=length(time))
  F_exp_V  <- vector(length=length(time))
  for (i in 1:length(time)) {
    F_gaus_V[i] <- gaus_A*exp(-2.77*((i-gaus_pos)/gaus_width)^2)
    F_exp_V[i]  <- exp_A1*exp(-1*(i/tau1)^beta1)
  }
  convolve(F_gaus_V, F_exp_V, FALSE)
}
#function for signal generation
sim_str_exp <- function(num_points=512, time_scale=512, tau1=45, beta1=.80, exp_A1=1, gaus_pos=65,
                        gaus_width=15, gaus_A=1, Y0=0, error=2.0, show_graph=TRUE, norm="False"){
  F_gen_V    <- vector(length=num_points)
  time_gen_V <- vector(length=num_points)
  F_gaus_V   <- vector(length=num_points)
  F_exp_V    <- vector(length=num_points)
  ts    <- time_scale/num_points
  sigma <- vector(length=num_points)
  for (i in 1:num_points) {
    F_gaus_V[i]   <- gaus_A*exp(-2.77*((i*ts-gaus_pos)/gaus_width)^2)
    F_exp_V[i]    <- exp_A1*exp(-1*(i*ts/tau1)^beta1)
    time_gen_V[i] <- i*ts
  }
  F_gen_V <- convolve(F_gaus_V, F_exp_V, FALSE) + Y0
  if (norm==TRUE){
    F_gen_V <- F_gen_V/max(F_gen_V)
  }
  error_V <- runif(num_points, -1*error, error)
  for (i in 1:num_points){
    F_gen_V[i] <- error_V[i]/100*F_gen_V[i] + F_gen_V[i]
    sigma[i]   <- error_V[i]/100*F_gen_V[i]
  }
  RETURN <- list(time=time_gen_V, y=F_gen_V, sigma=sigma)
  if (show_graph==TRUE){
    plot(RETURN[[1]], RETURN[[2]], type="l", main="Generated signal with noise",
         xlab="time, pixel", ylab="Intensity")
  }
  return(RETURN)
}

You haven't shown us sim_str_exp, so this example isn't fully reproducible, but I'm going to take a guess here. You say "I generate my own signal with noise", but you use Yexp=sim_str_exp(error=0) to generate the data, so it looks like you're not in fact adding any noise. (Also, your reported RSS at the last step is 1.37e-28 ...)
My guess is that you're running into a problem documented in ?nls, which is that nls() doesn't work well when there is zero noise. This is not documented in ?nlsLM, but I wouldn't be surprised if it held there too.
For convenience, here is the section I'm referring to from ?nls:
Do not use ‘nls’ on artificial "zero-residual" data.
The ‘nls’ function uses a relative-offset convergence criterion
that compares the numerical imprecision at the current parameter
estimates to the residual sum-of-squares. This performs well on
data of the form
y = f(x, theta) + eps
(with ‘var(eps) > 0’). It fails to indicate convergence on data
of the form
y = f(x, theta)
because the criterion amounts to comparing two components of the
round-off error. If you wish to test ‘nls’ on artificial data
please add a noise component, as shown in the example below.
If my hypothesis is correct then you should be able to get a fit without errors if you set the noise amplitude greater than zero.
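For what it's worth, here is a minimal sketch of that check, assuming the sim_str_exp, str_exp_model, start, lower and upper objects defined in the question: regenerate the data with a non-zero noise amplitude and refit.
library(minpack.lm)
set.seed(100)
Yexp <- sim_str_exp(error = 2)   # error > 0, so var(eps) > 0
dataset_nls <- data.frame(time = Yexp$time, y = Yexp$y)
FIT <- nlsLM(y ~ str_exp_model(time, tau1, beta1, exp_A1, gaus_pos, gaus_width, gaus_A),
             data = dataset_nls, start = start, lower = lower, upper = upper,
             control = nls.lm.control(maxiter = 200))
summary(FIT)
If the zero-residual issue is the cause, this version should converge without the singular-gradient error.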

Related

How can I obtain HAC standard errors for my VAR model?

I've seen similar questions to this but the solutions don't seem to apply to my situation.
Anyway, my data looks like this. All variables are I(0) after first differencing.
diff.data.gdpg. diff.data.narrowmg. diff.data.inf. diff.data.stir. diff.data.xrusd.
1 -0.51271298 -1.823265 -1.6304108 -1.0116667 -1.1520946
2 -0.04111672 2.799135 -0.3754515 -0.8033333 -3.8242471
3 -1.27394110 1.171467 -1.0167953 -0.7600000 -0.3483001
4 -1.23568342 3.327228 -0.6832069 -0.9600000 -1.1126535
5 2.92195504 4.291975 0.8145149 -0.7100000 0.4041784
6 1.79054994 2.522487 0.9598156 0.6266667 0.3260302
(I know that it is technically not advised to first-difference when using a VAR, but I'm not concerned with forecasting here; I just want to get an idea of the relationships across these variables.)
Anyway, I use the vars package and run my VAR. I save the lm object I would like to get the standard errors of to another variable.
varmodel <- VAR(newdata, p = 4, type = "const")
lmobject <- varmodel$varresult$diff.data.stir.
Finally, I call vcovHAC on the object.
vcovHAC(lmobject)
which gives me the following error
Error in bread. %*% meat. : non-conformable arguments
Does anyone have any solutions?
I might be too late, but I will answer this anyway in case anybody else searches for the same problem. I guess your problem here is that estimating your model with the VAR() function gives you a varest object. A varest object is not okay for vcovHAC(), while for vcovHC() it is. If you want to get your HAC covariance matrix, you should estimate your VAR model with dynlm(), for example; the resulting lm object is okay for vcovHAC().
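To make that concrete, here is a rough sketch of the suggested route, assuming newdata can be treated as a ts/zoo object with the column names shown above (the exact formula is illustrative, not taken from the question):
library(dynlm)
library(sandwich)
library(lmtest)
# re-estimate the stir equation of the VAR(4) as a single dynamic lm
eq_stir <- dynlm(diff.data.stir. ~ L(diff.data.stir., 1:4) + L(diff.data.gdpg., 1:4) +
                   L(diff.data.narrowmg., 1:4) + L(diff.data.inf., 1:4) +
                   L(diff.data.xrusd., 1:4),
                 data = newdata)
vcovHAC(eq_stir)                    # HAC covariance matrix for this equation
coeftest(eq_stir, vcov. = vcovHAC)  # coefficients with HAC standard errors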

How to fit an inverse Gaussian distribution to my data, preferably using fitdist {fitdistrplus}

I am trying to analyze some reaction time data using a GLMM and want to find the distribution that fits my data best. I used fitdist() for the gamma and lognormal distributions; the results showed that the lognormal fits my data better.
However, I recently read that the inverse Gaussian distribution might be a better fit for reaction time data.
I used nigFitStart to obtain the start values:
library(GeneralizedHyperbolic)
invstrt <- nigFitStart(RTtotal, startValues = "FN")
which gave me this:
$paramStart
mu delta alpha beta
775.953984862 314.662306398 0.007477984 -0.004930604
So I tried using these start parameters for fitdist:
require(fitdistrplus)
fitinvgauss <- fitdist(RTtotal, "invgauss", start = list(mu=776, delta=314, alpha=0.007, beta=-0.05))
but I get the following error:
Error in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, :
'start' must specify names which are arguments to 'distr'.
I also used ig_fit{goft} and got the following results:
Inverse Gaussian MLE
mu 775.954
lambda 5279.089
So, this time I used these two parameters for the start argument in fitdist and still got the exact same error:
> fitinvgauss <- fitdist(RTtotal, "invgauss", start = list(mu=776, lambda=5279))
Error in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, :
'start' must specify names which are arguments to 'distr'.
Someone had mentioned that changing the parameter names from mu and lambda to mean and shape had solved their problem, but I tried it and still got the same error.
Any idea how I can fix this? Or could you suggest an alternative way to fit an inverse Gaussian to my data?
Thank you.
dput(RTtotal)
c(594.96, 659.5, 706.14, 620.92, 811.05, 420.63, 457.08, 585.53,
488.59, 484.87, 496.72, 769.01, 458.92, 521.76, 889.08, 514.11,
553.09, 564.68, 1057.19, 437.79, 660.33, 639.58, 643.45, 419.47,
469.16, 457.78, 530.58, 538.73, 557.17, 1140.09, 560.03, 543.18,
1093.29, 607.59, 430.2, 712.06, 716.6, 566.69, 989.71, 449.96,
653.22, 556.52, 654.8, 472.54, 600.26, 548.36, 597.51, 471.97,
596.72, 600.29, 706.77, 511.6, 475.89, 599.13, 570.12, 767.57,
402.68, 601.56, 610.02, 891.95, 483.22, 588.78, 505.95, 554.15,
445.54, 489.02, 678.13, 532.06, 652.61, 654.79, 535.08, 1215.66,
633.6, 645.92, 454.37, 535.81, 508.97, 690.78, 685.97, 703.04,
731.99, 592.75, 662.03, 1400.33, 599.73, 1021.34, 1232.35, 855.1,
780.32, 554.4, 1965.77, 841.89, 1262.76, 721.62, 788.95, 1104.24,
1237.4, 1193.04, 513.91, 474.74, 380.56, 570.63, 700.96, 380.89,
481.96, 723.63, 835.22, 781.1, 468.76, 555.1, 522.22, 944.29,
541.06, 559.18, 738.68, 880.58, 500.14, 1856.97, 1001.59, 703.7,
1022.35, 1813.35, 1128.73, 864.75, 1166.77, 1220.4, 776.56, 2073.72,
1223.88, 617, 1387.71, 595.57, 1506.13, 678.41, 1797.87, 2111.04,
1116.61, 1038.6, 894.25, 778.51, 908.51, 1346.69, 989.09, 1334.17,
877.31, 649.31, 978.22, 1276.84, 1001.58, 1049.66, 1131.83, 700.8,
1267.21, 693.52, 1182.3)
So I'm guessing that you failed to tell us that you also have the statmod package loaded (or perhaps some other package with an 'invgauss' family including a dinvgauss function). You should be able to tell which package dinvgauss comes from by reading the top line of the help page for the function. So, after installing that package and reading the help page (which one should ALWAYS do) for ?dinvgauss:
fitinvgauss <- fitdist(RTtotal, "invgauss",
start = list(mean=776, dispersion=314, shape=1))
fitinvgauss
# --------------
Fitting of the distribution ' invgauss ' by maximum likelihood
Parameters:
estimate Std. Error
mean 779.2535 NA
dispersion -1007.5490 NA
shape 4972.5745 NA
All I did was read the error message and then read the help page and use the correct names for that function's parameters. (And then play around a bit to get the parameter starting values into the feasible range of values.)
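As a quick sanity check, here is a sketch of how you could confirm which attached package provides dinvgauss and what its arguments are called (assuming it is statmod, as above):
find("dinvgauss")   # e.g. "package:statmod"
library(statmod)
args(dinvgauss)     # shows the argument names (mean, shape, dispersion in statmod's version)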

'Invalid parent values' error when running JAGS from R

I am running a simple generalized linear model, calling JAGS from R. The response is negative binomially distributed. The model is being fitted to data on counts of fish, with the majority of individual counts ('C' in the data set below) being zeros.
I initially ran the model with one covariate, temperature ('Temp'). About half of the time the model ran and the other half of the time the model gave me the error, 'Error in node C[###] Invalid parent values.' The value for C[###] in the error message changes with each successive attempt to run the model.
Since my success at running the model was inconsistent, I tried adding another covariate, salinity ('Salt'). Then the model would not run at all, with the same error message as above.
Any ideas or suggestions on the source of the error are greatly appreciated.
I suspect that the initial values for the dispersion parameter, r, may be the issue. Ideally I would add several more covariates to the model once this error is addressed.
The data set and code are immediately below. For the sake of getting the data to load properly on this website, I have omitted 662 of the 672 total values; even with the reduced data set (n = 10 instead of n = 672) the problem remains.
Thank you.
setwd("C:/Users/John/Desktop")
library('coda')
library('rjags')
library('R2jags')
set.seed(1000000000)
#data
n=10
C=c(0,0,0,0,0,1,0,0,0,1)
Temp=c(0,29.3,25.3,28.7,28.7,24.4,25.1,25.1,24.2,23.3)
Salt=c(6,6,0,6,6,0,12,12,6,12)
sink("My Model.txt")
cat("
model {
  r ~ dunif(0,10)
  beta0 ~ dunif(-20,20)
  beta1 ~ dunif(-20,20)
  beta2 ~ dunif(-20,20)
  for (i in 1:n) {
    C[i] ~ dnegbin(p[i], r)
    p[i] <- r/(r+lambda[i])
    log(lambda[i]) <- mu[i]
    mu[i] <- beta0 + beta1*Temp[i] + beta2*Salt[i]
  }
}
", fill=TRUE)
sink()
n=n
C=C
Temp=Temp
Salt=Salt
#bundle data
bugs.data = list(
"n",
"C",
"Temp",
"Salt")
#parameters to monitor
params<-c(
"r",
"beta0",
"beta1",
"beta2")
#initial values
inits <- function(){list(
r=floor(runif(1,0,5)),
beta0=runif(1,-5,5),
beta1=runif(1,-5,5),
beta2=runif(1,-5,5))}
model.file <- 'My Model.txt'
jagsfit <- jags(data=bugs.data, inits=inits, params, n.iter=1000, n.thin=10, n.burnin=100, model.file)
print(jagsfit, digits=5)
This works fine for me most of the time, but it would fail with the error you describe if the inits function samples a value of r of 0 - which you have made more likely by using floor() in the inits function (I'm not sure why you did that - r is not restricted to integers, but it is strictly positive). Also, every time you run the model you will get different initial values (unless you set a random seed in R), which is making your life more complicated than it needs to be. I generally recommend picking fixed (and probably overdispersed) initial values, such as r=0.01 and r=10 for the two chains in your example.
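As a sketch of what that could look like with R2jags (the specific numbers are just illustrative placeholders):
# fixed, overdispersed initial values, one list per chain
inits <- list(
  list(r = 0.01, beta0 = -2, beta1 = -2, beta2 = -2),
  list(r = 10,   beta0 =  2, beta1 =  2, beta2 =  2)
)
jagsfit <- jags(data = bugs.data, inits = inits, params, n.chains = 2,
                n.iter = 1000, n.thin = 10, n.burnin = 100, model.file)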
However, JAGS picks usable initial values for this model as you can see by not providing your own inits e.g.:
library('runjags')
listdata <- lapply(bugs.data, get)
names(listdata) <- unlist(bugs.data)
run.jags(model.file, params, listdata)
I would also have a think about the prior you are using for r - it could well be that this will have a bigger effect on your posterior than intended. Another (not necessarily better) option is something like a gamma prior.
Matt

How to feed data into ode while doing optimisation

I'm new to R. I found some very useful code, which I've tried to adapt for my purposes. However, I get these errors:
Error in func(time, state, parms, ...) : object 'k4' not found and Error in func(time, state, parms, ...) : object 'E' not found
I don't know where the problem is, as I can see all the parameters, and the data.frame is correct as well.
Thank you everyone for taking the time to look at this. I've tried reducing the number of parameters to 3 (k10, k11, k12) and using estimated values for the remaining ones (embedded in the code). However, I still get an error message: the E value from the data.frame is not passed into the rxnrate function, and as a result ode can't use it. I've tried to use events and forcing functions, but that doesn't seem to work. Thank you for spotting P4; it was a typo, it should be P, and I've already corrected it.
Editors note: This was crossposted to Rhelp and that message included the source of this code as a stackoverflow question "r-parameter and initial conditions fitting ODE models with nls.lm."
#set working directory
setwd("~/R/wkspace")
#load libraries
library(ggplot2)
library(reshape2)
library(deSolve)
library(minpack.lm)
time=c(22,23,24,46,47,48)
cE=c(15.92,24.01,25.29,15.92,24.01,25.29)
cP=c(0.3,0.14,0.29,0.3,0.14,0.29)
cL=c(6.13,3.91,38.4,6.13,3.91,38.4)
df<-data.frame(time,cE,cP,cL)
df
names(df)=c("time","cE","cP","cL")
#rate function
rxnrate=function(t,c,parms){
  #rate constants passed through a list called parms
  k1=parms$k1
  k2=parms$k2
  k3=parms$k3
  k4=parms$k4
  k5=parms$k5
  k6=parms$k6
  k7=parms$k7
  k8=parms$k8
  k9=parms$k9
  k10=parms$k10
  #c is the concentration of species
  #derivatives dc/dt are computed below
  r=rep(0,length(c))
  r[1]=(k1+(k2*E^k10)/(k3^k10+E^k10))/(1+P/k6)-k4*((1+k5*P)/(1+k7*E))*c["pLH"]  #dRP_LH/dt
  r[2]=(1/k8)*k4*((1+k5*P)/(1+k7*E))*c["p"]-k9*c["L"]  #dL/dt
  return(list(r))
}
ssq=function(myparms){
  #initial concentration
  cinit=c(pLH=unname(myparms[11]),LH=unname(myparms[12]))
  print(cinit)
  #time points for which conc is reported
  #include the points where data is available
  t=c(seq(0,46,2),df$time)
  t=sort(unique(t))
  #parameters from the parameter estimation
  k1=myparms[1]
  k2=myparms[2]
  k3=myparms[3]
  k4=myparms[4]
  k5=myparms[5]
  k6=myparms[6]
  k7=myparms[7]
  k8=myparms[8]
  k9=myparms[9]
  k10=myparms[10]
  #solve ODE for a given set of parameters
  out=ode(y=cinit,times=t,func=rxnrate,
          parms=list(k1=k1,k2=k2,k3=k3,k4=k4,k5=k5,k6=k6,k7=k7,k8=k8,k9=k9,k10=k10,E=cE,P=cP))
  #filter data that contains time points
  outdf=data.frame(out)
  outdf=outdf[outdf$time %in% df$time,]
  #evaluate predicted vs experimental residual
  preddf=melt(outdf,id.var="time",variable.name="species",value.name="conc")
  expdf=melt(df,id.var="time",variable.name="species",value.name="conc")
  ssqres=preddf$conc-expdf$conc
  return(ssqres)
}
# parameter fitting using Levenberg-Marquardt
#initial guess for parameters
myparms=c(k1=500, k2=4500, k3=200,k4=2.42, k5=0.26,k6=12.2,k7=0.004,k8=55,k9=24,k10=8,pLH=14.5,LH=3.55)
#fitting
fitval=nls.lm(par=myparms,fn=ssq)
#summary of fit
summary(fitval)
#estimated parameter
parest=as.list(coef(fitval))
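For reference, one common pattern for getting time-varying inputs like E and P into the rate function with deSolve is to interpolate them with approxfun() and evaluate the interpolants inside rxnrate. This is only a sketch of that idea (the state names are assumed to match cinit), not necessarily the fix the original code was aiming for:
E_fun <- approxfun(df$time, df$cE, rule = 2)  # rule = 2 holds the end values outside the data range
P_fun <- approxfun(df$time, df$cP, rule = 2)
rxnrate <- function(t, c, parms){
  E <- E_fun(t)   # interpolated input at time t
  P <- P_fun(t)
  k1=parms$k1; k2=parms$k2; k3=parms$k3; k4=parms$k4; k5=parms$k5
  k6=parms$k6; k7=parms$k7; k8=parms$k8; k9=parms$k9; k10=parms$k10
  r <- rep(0, length(c))
  r[1] <- (k1+(k2*E^k10)/(k3^k10+E^k10))/(1+P/k6) - k4*((1+k5*P)/(1+k7*E))*c["pLH"]  #dRP_LH/dt
  r[2] <- (1/k8)*k4*((1+k5*P)/(1+k7*E))*c["pLH"] - k9*c["LH"]                        #dLH/dt
  list(r)
}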

Error when estimating parameters of the Frechet distribution using mmedist or fitdist (with mme)

I'm relatively new to R and I would appreciate it if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse Weibull) using mmedist (I also tried fitdist, which calls mmedist), but I get the following error:
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is the below:
require(actuar)
library(fitdistrplus)
library(MASS)
#values
n=100
scale = 1
shape=3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried changing the code many times in order to see if my mistake was in the syntax, but I always get the same error.
I'm aware of the example in the documentation. I've tried that as well, but with no luck. Please note that in order for the method to work, the order of the moment must be smaller than the shape parameter.
The example is the following:
require(actuar)
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
ifelse(order == 1, mean(x), sum(x^order)/length(x))
#fit
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist -- I recommend that you copy out the code, and make your own function foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether an object literally named "memp" exists as a function, as opposed to whether the argument that you actually passed exists as a function. Change it to:
if (!exists(as.character(expression(memp)), mode = "function"))
The second change, as I have already noted, relates to the fact that the optim routine actually calls funobj, which calls DIFF2, which in turn calls (see line 112) the user-supplied memp function -- minvweibull in your case -- with two arguments: obs, which resolves to your data, and order. Since minvweibull does not take the data as its first argument, this fails.
This is expected, as the help page tells you:
memp A function implementing empirical moments, raw or centered but
has to be consistent with distr argument. This function must have
two arguments : as a first one the numeric vector of the data and as a
second the order of the moment returned by the function.
How can you fix this? Pass the function moment from the moments package. Here is complete code (assuming that you have made the change above, and created a new function called foo_mmedist):
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimating the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
start= c(shape=5,scale=2), order=c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note, however, that this actually reduces to a crude way of doing overdetermined method of moments, and I am not sure that it is theoretically appropriate.
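For reference, instead of moments::moment you could also hand-write an empirical raw-moment function of the required form (first argument the data, second the moment order) and pass it in the same way -- a sketch, assuming the modified foo_mmedist from above:
memp <- function(x, order) mean(x^order)   # empirical raw moment of the given order
para_lm <- foo_mmedist(data_fre, "invweibull",
                       start = c(shape = 5, scale = 2), order = c(1, 2), memp = memp)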

Resources