Passing sparse xreg to stlf in R causes optimisation error

I am trying to forecast a time series and regress on temperature. The residuals behave differently at low and high temperatures, so I want to use a piecewise linear approach and learn different coefficients for temperatures above and below 35 degrees.
The data is in a dataframe data$x, data$Season, data$Temp.
#Create data frame
len<-365*3 + 1 +31
x<-rnorm(len,mean=4000000,sd=100000)
Season<-c(rep(3,62),rep(4,91),rep(1,90),rep(2,92),rep(3,92),rep(4,91),rep(1,90),rep(2,92),rep(3,92),rep(4,91),rep(1,91),rep(2,92),rep(3,61))
Temp<-rnorm(len,mean=20,sd=5)
data<-data.frame(x,Season,Temp)
#Create model matrix
season_dummy<-model.matrix(~as.factor(data$Season)+0)
Temp_max=pmax(0,data$Temp-35) # creates 0, or a difference
Temp_restore<-restore_temp_up(Temp_max,data$Temp,35) # restores difference to original value
Temp_season_matrix_max=Temp_restore * season_dummy
#Create time-series and forecast
data_ts<-ts(data$x[1:1000],freq=365,start=c(2009,182))
len_train<-length(data_ts)
xreg1<-Temp_season_matrix_max[1:len_train,]
newxreg1<-Temp_season_matrix_max[(len_train+1):(len_train+30),]
stlf(data_ts,method="arima",h=30,xreg=xreg1,newxreg=newxreg1,s.window="periodic")
> Error in optim(init[mask], armaCSS, method = optim.method, hessian = FALSE, :
non-finite value supplied by optim
Error in auto.arima(x, xreg = xreg, seasonal = FALSE, ...) :
No suitable ARIMA model found
In addition: Warning message:
In auto.arima(x, xreg = xreg, seasonal = FALSE, ...) :
Unable to calculate AIC offset
>
Other threads suggest changing the solver method from CSS to ML, but I can't edit these parameters in stlf. The help file shows an optional parameter "forecastfunction", but there are no examples or real explanation of how to use it.
Note: when I set the threshold temperature to, say, 20 instead of 35, this works fine. I am sure it is because the xreg matrix containing temperatures above 35 degrees is sparse (most temperatures are below this value), but I am not sure how to get around this.
(I have included code for restore_temp_up - possibly inefficient, but included here for question completion.)
restore_temp_up<-function(x,original,k){
  if(!is.vector(x))
    stop('x must be a vector')
  for (i in 1:length(x)){
    if(!is.na(x[i])){
      if (x[i] > 0){
        x[i]<-x[i]+k
      }
      if (original[i] == k){
        x[i]<-original[i] ## this is the case if original WAS =k, then dont know whether original is 0,
      }
    }
  }
  return(x)
}
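Since the loop above is flagged as possibly inefficient, here is a minimal vectorised sketch of the same logic, checked against the loop on made-up temperatures. The name restore_temp_up_vec and the test data are hypothetical. It also assumes original is never NA where x is non-NA (the loop version would stop with an error in that case, while this version silently skips such entries).

```r
# Loop version from the question, kept only for comparison
restore_temp_up <- function(x, original, k) {
  if (!is.vector(x)) stop("x must be a vector")
  for (i in 1:length(x)) {
    if (!is.na(x[i])) {
      if (x[i] > 0) x[i] <- x[i] + k
      if (original[i] == k) x[i] <- original[i]
    }
  }
  x
}

# Hypothetical vectorised equivalent: same two steps, applied with logical masks
restore_temp_up_vec <- function(x, original, k) {
  if (!is.vector(x)) stop("x must be a vector")
  ok <- !is.na(x)
  pos <- ok & x > 0                       # entries that held a positive difference
  x[pos] <- x[pos] + k                    # shift them back to the original scale
  at_k <- ok & !is.na(original) & original == k
  x[at_k] <- original[at_k]               # temperatures exactly equal to k
  x
}

# Made-up temperatures, including one value exactly at 35 and one NA
set.seed(1)
temp <- c(rnorm(200, mean = 30, sd = 5), 35, NA)
temp_max <- pmax(0, temp - 35)

loop_out <- restore_temp_up(temp_max, temp, 35)
vec_out <- restore_temp_up_vec(temp_max, temp, 35)
identical(loop_out, vec_out)
```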

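Your design matrix is rank deficient, so the regression is singular. To see this:
> eigen(t(xreg1) %*% xreg1)$values
[1] 1321.223 0.000 0.000 0.000
Three of the four eigenvalues are zero, so only one column of xreg1 contains any non-zero entries; the other season columns never see a temperature at or above 35 in the training window and are identically zero. You cannot fit a regression model with a rank deficient design matrix.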
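A minimal sketch of how the rank deficiency could be detected, and how all-zero columns might be dropped before the matrix is used as xreg. The data here are made up so the outcome is deterministic, and the names (season, temp, xreg_reduced) are hypothetical. Whether stlf then fits depends on the data; the same columns would also have to be dropped from newxreg, and a column with only one or two non-zero rows may still be poorly estimated.

```r
# Made-up data: 1000 days, four seasons, temperatures capped below 35
set.seed(42)
n <- 1000
season <- rep(1:4, length.out = n)
temp <- pmin(rnorm(n, mean = 20, sd = 5), 34)
temp[c(10, 14)] <- c(36.4, 37.1)   # two hot days, both falling in season 2

# Same construction as in the question: hot temperatures times season dummies
season_dummy <- model.matrix(~ as.factor(season) + 0)
temp_hot <- ifelse(temp >= 35, temp, 0)
xreg <- temp_hot * season_dummy

# Numerical rank of the design matrix versus its number of columns
rank_full <- qr(xreg)$rank
c(rank = rank_full, columns = ncol(xreg))

# Keep only columns that contain at least one non-zero entry
keep <- colSums(xreg != 0) > 0
xreg_reduced <- xreg[, keep, drop = FALSE]
qr(xreg_reduced)$rank == ncol(xreg_reduced)
```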

Related

R non-linear model fitting using fitModel function

I want to fit a non-linear model to a real data.
The real data consists of 2 known numerical vectors ; thickness as 'x' and fh as 'y'
thickness=seq(0.15,2.00,by=0.05)
fh = c(5.17641, 4.20461, 3.31091, 2.60899, 2.23541, 1.97771, 1.88141, 1.62821, 1.50138, 1.51075, 1.40850, 1.26222, 1.09432, 1.13202, 1.12918, 1.10355, 1.11867, 1.09740,1.08324, 1.05687, 1.19422, 1.22984, 1.34516, 1.19713,1.25398 ,1.29885, 1.33658, 1.31166, 1.40332, 1.39550,1.37855, 1.41491, 1.59549, 1.56027, 1.63925, 1.72440, 1.74192, 1.82049)
plot(thickness,fh)
This is apparently non-linear. So, I am trying to fit this model as a non-linear function of
y= x*2/3+(2+2*a)/(3*x)
The variable a is an unknown constant, and I am trying to find the value of a that minimizes the sum of squared errors between the fitted curve and the real data.
I first used a function fitModel that I found on a YouTube video, Fitting Functions to Data in R.
library(TIMP)
f=fitModel(fh~thickness^2/3+(2+2*A)/(3*thickness)) #it finds the coefficient 'A'
coef(f) # to represent just the coefficient
However, there's an error
Error in modelspec[[datasetind[i]]] : subscript out of bounds
So, as an alternative, I want to plot the sum of squared errors against 'a'. I am having a hard time finding 'a' and plotting this graph. By hand, I worked out that 'a' is somewhere near 0.2, but this is not a precise value.
It would be helpful if someone could explain either:
Why the fitModel function didn't work or
How to find the value a and plot the graph.
You could try this instead:
yf = function(a,xv) xv*(2/3)+(2+2*a)/(3*xv)
yf(2,thickness)
f <- function (a,y, xv) sum((y - yf(a,xv))^2)
f(2,fh,thickness)
xmin <- optimize(f, c(0, 10), tol = 0.0001, y=fh,xv=thickness)
xmin
plot(thickness,fh)
lines(thickness,yf(xmin$minimum,thickness),col=3)
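To also get the plot of the sum of squared errors against a, a minimal sketch follows. It uses the model exactly as written in the answer above (xv*(2/3) as the first term). The grid range and the names sse, a_grid, a_closed and a_nls are assumptions for illustration. Because the model is linear in a, the least-squares value can also be written in closed form, which gives a cross-check on the nls() fit.

```r
thickness <- seq(0.15, 2.00, by = 0.05)
fh <- c(5.17641, 4.20461, 3.31091, 2.60899, 2.23541, 1.97771, 1.88141, 1.62821,
        1.50138, 1.51075, 1.40850, 1.26222, 1.09432, 1.13202, 1.12918, 1.10355,
        1.11867, 1.09740, 1.08324, 1.05687, 1.19422, 1.22984, 1.34516, 1.19713,
        1.25398, 1.29885, 1.33658, 1.31166, 1.40332, 1.39550, 1.37855, 1.41491,
        1.59549, 1.56027, 1.63925, 1.72440, 1.74192, 1.82049)

# Model and sum of squared errors as a function of a
yf <- function(a, xv) xv * (2 / 3) + (2 + 2 * a) / (3 * xv)
sse <- function(a) sum((fh - yf(a, thickness))^2)

# SSE over a grid of candidate values (grid range is an assumption)
a_grid <- seq(-1, 1, by = 0.01)
sse_grid <- vapply(a_grid, sse, numeric(1))
plot(a_grid, sse_grid, type = "l", xlab = "a", ylab = "Sum of squared errors")

# The model is yf(0, x) + a * z with z = 2 / (3 * x), so the minimiser is explicit
z <- 2 / (3 * thickness)
r <- fh - yf(0, thickness)
a_closed <- sum(z * r) / sum(z^2)

# Same estimate via nls(), started at the hand-tuned guess from the question
fit <- nls(fh ~ thickness * (2 / 3) + (2 + 2 * a) / (3 * thickness),
           start = list(a = 0.2))
a_nls <- coef(fit)[["a"]]
abline(v = a_nls, lty = 2)
```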

Model fitting with nls.lm in R, "Error: unused argument"

I'm trying to use the nls.lm function in the minpack.lm package to fit a non-linear model to some data from a psychophysics experiment.
I've had a search around and can't find a lot of information about the package so have essentially copied the format of the example given on the nls.lm help page. Unfortunately my script is still failing to run and R is throwing out this error:
Error in fn(par, ...) :
unused argument (observed = c(0.1429, 0.2857, 0.375, 0.3846, 0.4667, 0.6154))
It appears that the script thinks the data I want to fit the model to is irrelevant, which is definitely wrong.
I'm expecting it to fit the model and produce a value of 0.5403 for the spare parameter (w).
Any help is greatly appreciated.
I'm making the transfer from Matlab over to R so apologies if my code looks sloppy.
Here's the script.
install.packages("pracma")
require(pracma)
install.packages("minpack.lm")
require(minpack.lm)
# Residual function, uses parameter w (e.g. .23) to predict accuracy error at a given ratio [e.g. 2:1]
residFun=function(w,n) .5 * erfc( abs(n[,1]-n[,2])/ ((sqrt(2)*w) * sqrt( (n[,1]^2) + (n[,2]^2) ) ) )
# example for residFun
# calculates an error rate of 2.59%
a=matrix(c(2,1),1,byrow=TRUE)
residFun(.23,a)
# Initial guess for parameter to be fitted (w)
parStart=list(w=0.2)
# Recorded accuracies in matrix, 1- gives errors to input into residFun
# i.e. the y-values I want to fit the model
Acc=1-(matrix(c(0.8571,0.7143,0.6250,0.6154,0.5333,0.3846),ncol=6))
# Ratios (converted to proportions) used in testing
# i.e. the points along the x-axis to fit the above data to
Ratios=matrix(c(0.3,0.7,0.4,0.6,0.42,0.58,0.45,0.55,0.47,0.53,0.49,0.51),nrow=6,byrow=TRUE)
# non-linear model fitting, attempting to calculate the value of w using the Levenberg-Marquardt nonlinear least-squares algorithm
output=nls.lm(par=parStart,fn=residFun,observed=Acc,n=Ratios)
# Error message shown after running
# Error in fn(par, ...) :
# unused argument (observed = c(0.1429, 0.2857, 0.375, 0.3846, 0.4667, 0.6154))
The error means a function was passed an argument that it did not expect. nls.lm (see ?nls.lm) has no argument named observed, so it is forwarded through ... to the function supplied as fn, in your case residFun. However, residFun doesn't expect this argument either, hence the error. You need to redefine this function so that it accepts observed and returns residuals (observed minus predicted), like this:
# Residual function, uses parameter w (e.g. .23) to predict accuracy error at a given ratio [e.g. 2:1]
residFun=function(par,observed, n) {
w <- par$w
r <- observed - (.5 * erfc( abs(n[,1]-n[,2])/ ((sqrt(2)*w) * sqrt( (n[,1]^2) + (n[,2]^2) ) ) ))
return(r)
}
It gives the following result :
> output = nls.lm(par=parStart,fn=residFun,observed=Acc,n=Ratios)
> output
Nonlinear regression via the Levenberg-Marquardt algorithm
parameter estimates: 0.540285874836135
residual sum-of-squares: 0.02166
reason terminated: Relative error in the sum of squares is at most `ftol'.
Why that happened:
It seems that you were inspired by this example in the documentation:
## residual function
residFun <- function(p, observed, xx) observed - getPred(p,xx)
## starting values for parameters
parStart <- list(a=3,b=-.001, c=1)
## perform fit
nls.out <- nls.lm(par=parStart, fn = residFun, observed = simDNoisy,
xx = x, control = nls.lm.control(nprint=1))
Note that observed is an argument of residFun here.
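The forwarding mechanism can be reproduced in base R alone. In this minimal sketch, call_fn is a hypothetical stand-in for how an optimiser passes extra named arguments through ... to fn (it is not the nls.lm implementation), and erfc is rewritten via pnorm so that pracma is not needed. As a rough cross-check, a one-dimensional optimize() over the sum of squared residuals is swapped in for Levenberg-Marquardt; it should land near the value reported above.

```r
# Base R stand-in for pracma::erfc, using erfc(x) = 2 * pnorm(-x * sqrt(2))
erfc <- function(x) 2 * pnorm(-x * sqrt(2))

ratios <- matrix(c(0.3, 0.7, 0.4, 0.6, 0.42, 0.58,
                   0.45, 0.55, 0.47, 0.53, 0.49, 0.51), nrow = 6, byrow = TRUE)
errors <- 1 - c(0.8571, 0.7143, 0.6250, 0.6154, 0.5333, 0.3846)

# Predicted error rate for parameter w at each ratio
predict_error <- function(w, n) {
  0.5 * erfc(abs(n[, 1] - n[, 2]) / ((sqrt(2) * w) * sqrt(n[, 1]^2 + n[, 2]^2)))
}

resid_old <- function(w, n) predict_error(w, n)                       # no 'observed'
resid_new <- function(par, observed, n) observed - predict_error(par$w, n)

# Hypothetical stand-in: extra named arguments are forwarded to fn via ...
call_fn <- function(par, fn, ...) fn(par, ...)

# The old signature fails because 'observed' has nowhere to go
msg <- tryCatch(call_fn(list(w = 0.2), resid_old, observed = errors, n = ratios),
                error = function(e) conditionMessage(e))
msg

# The new signature receives it and returns one residual per ratio
res <- call_fn(list(w = 0.2), resid_new, observed = errors, n = ratios)

# Cross-check with a 1-D search (not Levenberg-Marquardt); interval is an assumption
w_hat <- optimize(function(w) sum(resid_new(list(w = w), errors, ratios)^2),
                  interval = c(0.05, 2))$minimum
w_hat
```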

How to fit AR process (with nonconsecutive lags) to Time Series?

I want to estimate the coefficients for an AR process based on weekly data where the lags occur at t-1, t-52, and t-53. I will naturally lose a year of data to do this.
So far I have tried:
lags <- rep(0,54)
lags[1]<- NA
lags[52] <- NA
lags[53] <- NA
testResults <- arima(data,order=c(53,0,0),fixed=lags)
Basically, I tried using an ARIMA model with the MA and differencing orders set to zero. I used 0's for the terms I wanted to exclude (plus the intercept), and NAs for the terms I wanted to estimate.
I get the following error:
Error in optim(init[mask], armafn, method = optim.method, hessian =TRUE, :
non-finite finite-difference value [1]
In addition: Warning message:
In arima(data, order = c(53, 0, 0), fixed = lags) :
some AR parameters were fixed: setting transform.pars = FALSE
I'm hoping there is an easier method or potential solution to this error. I want to avoid creating columns with the lagged variables and simply running a regression. Thanks!
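A minimal sketch of two ways this could be set up, on simulated data with period 4 instead of 52 so that it runs quickly (the series, the period and all names are assumptions for illustration). The first fit uses the fixed mask as in the question, but leaves the intercept free; fixing it at 0 assumes a zero-mean series. The second relies on the fact that a multiplicative seasonal AR model, (1 - phi B)(1 - Phi B^s), has non-zero lags at exactly 1, s and s+1, with the lag s+1 coefficient constrained to -phi*Phi. Whether that constraint is acceptable for the real data is not something this sketch can decide.

```r
set.seed(1)
period <- 4   # stand-in for 52 in the weekly case

# Simulate (1 - 0.5 B)(1 - 0.3 B^4) x_t = e_t, i.e. AR lags 1, 4 and 5, plus a mean of 10
x <- arima.sim(model = list(ar = c(0.5, 0, 0, 0.3, -0.15)), n = 400) + 10
x <- ts(as.numeric(x), frequency = period)

# Option 1: full AR(period + 1) with unwanted lags fixed at 0
mask <- rep(0, period + 2)                 # period + 1 AR terms, then the intercept
mask[c(1, period, period + 1)] <- NA       # NA marks coefficients to be estimated
mask[period + 2] <- NA                     # leave the intercept free
fit_fixed <- arima(x, order = c(period + 1, 0, 0), fixed = mask,
                   transform.pars = FALSE)
coef(fit_fixed)

# Option 2: multiplicative seasonal AR, two AR parameters plus the intercept
fit_seas <- arima(x, order = c(1, 0, 0),
                  seasonal = list(order = c(1, 0, 0), period = period))
coef(fit_seas)
```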

Error in summary quantreg backsolve

When I run a quantile regression in R, using the quantreg package, and then I run summary(quantregObject), I get this error message:
Error in base::backsolve(r, x, k = k, upper.tri = upper.tri, transpose = transpose, : singular matrix in 'backsolve'. First zero in diagonal [1]
Any suggestions on how I could fix this problem?
In short, try:
summary(quantregObject, se = "iid")
which puts a strong assumption on your residuals. Or, if you need accuracy, use a bootstrap to get the standard errors:
summary(quantregObject, se = "boot")
If you call summary on an object returned by quantreg::rq:
summary(quantregObject)
this dispatches to summary.rq. From ?summary.rq you can see that there are 4 options to compute the standard errors (se). Depending on the sample size, the default is se = "rank" for smaller samples (N < 1000) or se = "nid" otherwise.
"nid" does something which sounds complicated and may then yield a singular matrix, which backsolve cannot handle.

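To illustrate what the error message itself means, here is a minimal base R sketch, independent of quantreg. backsolve solves a triangular system and stops when a diagonal entry is exactly zero. The matrices are made up for illustration.

```r
# Upper-triangular systems: one well-behaved, one with a zero on the diagonal
r_ok <- matrix(c(2, 0, 1, 3), nrow = 2)    # rows: (2, 1), (0, 3)
r_bad <- matrix(c(0, 0, 1, 3), nrow = 2)   # rows: (0, 1), (0, 3)
b <- c(1, 1)

# Solves r_ok %*% sol = b by back substitution
sol <- backsolve(r_ok, b)
sol

# The zero in position [1, 1] makes the system singular, so backsolve stops
msg <- tryCatch(backsolve(r_bad, b), error = function(e) conditionMessage(e))
msg
```
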
Bootstrapping for nlme model

I am trying to use bootstrapping to derive errors around my parameter estimate for the fixed effects in the following model. It is simply estimating the number of times an animal might cross a road based on the road's distance from a stream.
When I run the nlme model it does converge and all is well. I have tried several different methods to do the bootstrapping but have been unsuccessful. I have tried both using the boot package and simply developing a bit of code that resamples my data and drops the parameter estimates into new vectors.
Below is my attempt at the latter and the resulting error messages. Any help would be greatly appreciated.
bv <- numeric(100)
cv <- numeric(100)
for(i in 1:100){
ss <- sample(1:130, replace=T)
y <- nwfcross[ss]
x <- nwfdist[ss]
modelb <- nlme(y~a*exp(-b*x), fixed=a+b~1,
random=a+b~1|nwfid, start=c(a=300,b=0.016))
bv[i] <- coef(modelb)[1]
cv[i] <- coef(modelb)[2]
}
Error in nlme.formula(y ~ a * exp(-b * x), fixed = a + b ~ 1, random = a + :
Maximum number of iterations reached without convergence
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Warning messages:
1: Singular precision matrix in level -1, block 1
2: Singular precision matrix in level -1, block 1
3: Singular precision matrix in level -1, block 1….
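One pattern that may help is to wrap each refit in tryCatch, so that a resample that fails to converge is recorded as NA instead of stopping the loop. In this minimal sketch, nls() is swapped in for nlme() so that it runs with base R only, the data are simulated, and the names (dat, dist, cross, n_boot) are hypothetical. For the grouped nlme model it would probably also be necessary to resample whole animals (nwfid) together with their rows, so that the grouping variable stays aligned with y and x; that part is an assumption and is not shown here.

```r
set.seed(123)

# Simulated stand-in data: crossings decay exponentially with distance
n <- 130
dat <- data.frame(dist = runif(n, 0, 300))
dat$cross <- 300 * exp(-0.016 * dat$dist) + rnorm(n, sd = 10)

n_boot <- 100
a_hat <- rep(NA_real_, n_boot)
b_hat <- rep(NA_real_, n_boot)

for (i in seq_len(n_boot)) {
  # Resample whole rows so that response and predictor stay paired
  boot_dat <- dat[sample(nrow(dat), replace = TRUE), ]
  fit <- tryCatch(
    nls(cross ~ a * exp(-b * dist), data = boot_dat,
        start = c(a = 300, b = 0.016)),
    error = function(e) NULL               # failed fits are skipped
  )
  if (!is.null(fit)) {
    a_hat[i] <- coef(fit)[["a"]]
    b_hat[i] <- coef(fit)[["b"]]
  }
}

# How many resamples converged, and percentile intervals from those that did
n_ok <- sum(!is.na(b_hat))
n_ok
quantile(a_hat, c(0.025, 0.975), na.rm = TRUE)
quantile(b_hat, c(0.025, 0.975), na.rm = TRUE)
```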
