How to use Plotmo for plotting an arima object - plot

I would like to use the plotmo function from the plotmo package to plot an arima object. I estimated an ARIMA model with a matrix of explanatory variables X (a transfer function):
arima.model<-arima(y,c(3,1,3),xreg=X)
When plotting this object I get the following error:
plotmo(arima.model)
stats::predict(Arima.object, data.frame[3,1], type="response")
Error in predict.Arima(list(coef = c(0, 0, 0.426819838403672, -0.23337107002535, : 'xreg' and 'newxreg' have different numbers of columns
How could I fix this problem? Thanks C

Plotmo isn't really meant for time-series models like arima models and doesn't support them.
However, if you just want to plot the fitted model and some future values, the following function will do it (there may be a simpler way using the ts.plot function):
plarima <- function(ts, ..., n.ahead=1, main=deparse(substitute(ts)))
{
    model <- arima(ts, ...)
    if(!inherits(model, "Arima"))
        stop("this function requires 'arima' from the standard stats package")
    # calculations so we can extend the x axis
    n <- length(ts)
    x <- xy.coords(ts)$x
    if(any(is.na(x)))
        stop("NA in time")
    xdelta <- (x[n] - x[1]) / (n - 1) # spacing between consecutive observations
    # residuals = data - fit, so the fitted values are data - residuals
    plot(ts - model$residuals, # plot the fit in gray
         xlim=c(x[1], x[n] + xdelta * n.ahead),
         main=main, col="gray", lwd=3)
    lines(ts) # plot the data
    # predict n.ahead values and plot them in red
    forecast <- predict(model, n.ahead=n.ahead)
    lines(x=x[n] + xdelta * (0:n.ahead), y=c(ts[n], forecast$pred), col=2)
    legend("topleft", legend=c("data", "fitted", "forecast"),
           col=c(1, "gray", 2), lwd=c(1,3,1), lty=1, bg="white")
    model # return the arima model
}
For example
plarima(lh, order=c(3,0,0), n.ahead=10)
plarima(USAccDeaths, order=c(0,1,1), seasonal=list(order=c(0,1,1)), n.ahead=10)
gives the following plots
(I'm assuming you are using the arima function from the standard stats package. The forecast package provides its own Arima function.)
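The error message in the question comes from predict on an arima fit: a model fitted with xreg needs a newxreg with the same number of columns at prediction time. Below is a minimal sketch of that requirement using simulated data. The variable names, coefficients, and future regressor values are all made up for illustration; in practice newxreg must hold real future values of the explanatory variables.

```r
# Simulated data: two explanatory variables plus AR(1) noise (all hypothetical)
set.seed(42)
n <- 120
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- as.numeric(0.5 * X[, "x1"] - 0.3 * X[, "x2"] +
                arima.sim(list(ar = 0.6), n = n))

fit <- arima(y, order = c(1, 0, 0), xreg = X)

# Forecasting without newxreg fails with the error quoted in the question
h <- 6
msg <- tryCatch(predict(fit, n.ahead = h),
                error = function(e) conditionMessage(e))
print(msg)

# Supplying future regressor values with the same columns as X works
newX <- cbind(x1 = rnorm(h), x2 = rnorm(h))  # placeholder future values
fc <- predict(fit, n.ahead = h, newxreg = newX)
print(fc$pred)
```

The same idea applies inside a plotting helper: any call to predict on such a model has to pass newxreg through.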

Related

Why abline won't show line from glm with Gamma family?

I have the following data, which I'm trying to model with a GLM using the Gamma family. The model fits, but abline won't show any line. What am I doing wrong?
y <- c(0.00904977380111,0.009174311972687,0.022573363475789,0.081632653008122,0.005571030584803,1e-04,0.02375296916921,0.004962779106823,0.013729977117333,0.00904977380111,0.004514672640982,0.016528925619835,1e-04,0.027855153258277,0.011834319585449,0.024999999936719,1e-04,0.026809651528869,0.016348773841071,1e-04,0.009345794439034,0.00457665899303,0.004705882305772,0.023201856194357,1e-04,0.033734939711656,0.014251781472007,0.004662004755245,0.009259259166667,0.056872037917387,0.018518518611111,0.014598540145986,0.009478673032951,0.023529411811211,0.004819277060357,0.018691588737881,0.018957345923721,0.005390835525461,0.056179775223141,0.016348773841071,0.01104972381185,0.010928961639344,1e-04,1e-04,0.010869565271444,0.011363636420778,0.016085790883856,0.016,0.005665722322786,0.01117318441372,0.028818443860841,1e-04,0.022988505862069,0.01010101,1e-04,0.018083182676638,0.00904977380111,0.00961538466323,0.005390835525461,0.005763688703004,1e-04,0.005571030584803,1e-04,0.014388489208633,0.005633802760722,0.005633802760722,1e-04,0.005361930241431,0.005698005811966,0.013986013986014,1e-04,1e-04)
x <- c(600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,744.47,744.47,744.47,744.47,744.47,744.47,744.47,630.42,630.42,630.42,630.42,630.42,630.42,630.42,630.42,630.42)
hist(y,breaks=15)
plot(y~x)
fit <- glm(y~x,family='Gamma'(link='log'))
abline(fit)
abline plots straight lines, such as the fit of a simple linear regression. A GLM with a Gamma family and a log link is nonlinear on the original scale. To visualize the fit of such a model, you could use predict (an example is given below). Several R packages (e.g. effects or visreg) provide functions that plot the fit directly on the original scale, including confidence intervals.
Here is an example using visreg using your data and model:
library(visreg)
y <- c(0.00904977380111,0.009174311972687,0.022573363475789,0.081632653008122,0.005571030584803,1e-04,0.02375296916921,0.004962779106823,0.013729977117333,0.00904977380111,0.004514672640982,0.016528925619835,1e-04,0.027855153258277,0.011834319585449,0.024999999936719,1e-04,0.026809651528869,0.016348773841071,1e-04,0.009345794439034,0.00457665899303,0.004705882305772,0.023201856194357,1e-04,0.033734939711656,0.014251781472007,0.004662004755245,0.009259259166667,0.056872037917387,0.018518518611111,0.014598540145986,0.009478673032951,0.023529411811211,0.004819277060357,0.018691588737881,0.018957345923721,0.005390835525461,0.056179775223141,0.016348773841071,0.01104972381185,0.010928961639344,1e-04,1e-04,0.010869565271444,0.011363636420778,0.016085790883856,0.016,0.005665722322786,0.01117318441372,0.028818443860841,1e-04,0.022988505862069,0.01010101,1e-04,0.018083182676638,0.00904977380111,0.00961538466323,0.005390835525461,0.005763688703004,1e-04,0.005571030584803,1e-04,0.014388489208633,0.005633802760722,0.005633802760722,1e-04,0.005361930241431,0.005698005811966,0.013986013986014,1e-04,1e-04)
x <- c(600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,600,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,3500,744.47,744.47,744.47,744.47,744.47,744.47,744.47,630.42,630.42,630.42,630.42,630.42,630.42,630.42,630.42,630.42)
fit <- glm(y~x,family='Gamma'(link='log'))
visreg(fit, scale = "response")
And here is an example using R base graphics and predict:
pred_frame <- data.frame(
x = seq(min(x), max(x), length.out = 1000)
)
pred_frame$fit <- predict(fit, newdata = pred_frame, type = "response")
plot(y~x, pch = 16, las = 1, cex = 1.5)
lines(fit~x, data = pred_frame, col = "steelblue", lwd = 3)
You are not being consistent here since you chose to model on the log scale but you are plotting on the raw scale. Mind you many, many published plots do the same. You need to plot the points in log space or transform the coefficients and pass them to abline() explicitly.
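As a rough sketch of the two options, the code below fits a Gamma GLM with a log link to simulated data (the data and the true coefficients are invented for illustration), then draws the fit once as a straight line on the log scale and once as a curve on the raw scale. Note that the line on the log scale is the log of the fitted mean, which is not the same as the mean of log(y), so it will tend to sit somewhat above the centre of the logged points.

```r
# Simulated Gamma data with a log-linear mean (hypothetical values)
set.seed(1)
x <- rep(c(600, 3500), each = 30)
mu <- exp(-4 - 2e-04 * x)
y <- rgamma(length(x), shape = 2, rate = 2 / mu)

fit <- glm(y ~ x, family = Gamma(link = "log"))
b <- coef(fit)

# Option 1: plot in log space, where the linear predictor is a straight line
plot(log(y) ~ x)
abline(a = b[1], b = b[2], col = "red")

# Option 2: stay on the raw scale and back-transform the linear predictor
plot(y ~ x)
curve(exp(b[1] + b[2] * x), add = TRUE, col = "red")
```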

get equation of linear SVM regression line

I'm struggling to find a way to get the equation for a linear SVM model in the regression case, since most of the questions deal with classification... I have fit it with caret package.
1- univariate case
set.seed(1)
fit=train(mpg~hp, data=mtcars, method="svmLinear")
plot(x=mtcars$hp, y=predict(fit, mtcars), pch=15)
points(x=mtcars$hp, y=mtcars$mpg, col="red")
abline(lm(mpg~hp, mtcars), col="blue")
Which gives the plot with red=actual, black=fitted, and blue line is classic regression. In this case I know I could manually calculate the SVM prediction line from 2 points, but is there a way to get directly the equation from the model structure? I actually need the equation like this y=a+bx (here mpg=?+?*hp ) with values in the original scale.
2-multivariate
Same question, but with 2 independent (predictor) variables (mpg~hp+wt)
Thanks,
Yes, I believe there is. Take a look at this answer, which is similar, but does not use the caret library. If you add svp = fit$finalModel to the example, you should be able to follow it almost exactly. I applied a similar technique to your data below. I scaled the data to fit nicely on the plot of the vectors since the library scales the data at runtime.
require(caret)
require(kernlab) # provides alphaindex() and b() used below
set.seed(1)
x = model.matrix(data=mtcars, mpg ~ scale(hp)) #set up data
y = mtcars$mpg
fit=train(x, y, method="svmLinear") #train
svp = fit$finalModel #extract s4 model object
plot(x, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1],-w[2]/w[1], col='red')
abline((b+1)/w[1],-w[2]/w[1],lty=2, col='red')
abline((b-1)/w[1],-w[2]/w[1],lty=2, col='red')
And your second question:
x = model.matrix(data=mtcars, mpg ~ scale(hp) + scale(wt) - 1) #set up data
fit=train(x, y, method="svmLinear") #train
svp = fit$finalModel #extract s4 model object
plot(x, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1],-w[2]/w[1], col='red')
abline((b+1)/w[1],-w[2]/w[1],lty=2, col='red')
abline((b-1)/w[1],-w[2]/w[1],lty=2, col='red')
Edit
The above answer concerns plotting a boundary, not the linear SVM regression line. To answer the question, one easy way to get the line is to extract the predicted values and plot the regression. You actually only need a couple of points to get the line, but for simplicity, I used the following code.
abline(lm(predict(fit, newdata=mtcars) ~ mtcars$hp), col='green')
or
abline(lm(predict(fit) ~ mtcars$hp), col='green')
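The remark that a couple of points are enough can be sketched as follows. The helper line_from_two_points is hypothetical (it is not part of caret or kernlab); it recovers the intercept and slope of any model whose prediction is linear in a single predictor. To keep the sketch runnable with base R only, it is demonstrated on an lm fit as a stand-in for the SVM.

```r
# Hypothetical helper: intercept and slope from two predicted points
line_from_two_points <- function(predict_fun, x1, x2) {
  y1 <- predict_fun(x1)
  y2 <- predict_fun(x2)
  slope <- (y2 - y1) / (x2 - x1)
  c(intercept = y1 - slope * x1, slope = slope)
}

# Stand-in linear model so the example needs no extra packages
m <- lm(mpg ~ hp, data = mtcars)
ab <- line_from_two_points(
  function(h) unname(predict(m, newdata = data.frame(hp = h))),
  x1 = 100, x2 = 200
)
print(ab)  # should match coef(m), since the model is exactly linear
```

For the caret model in the question, one would presumably pass function(h) predict(fit, newdata = data.frame(hp = h)) instead, assuming the model was trained with the formula interface on the unscaled hp. The result is then mpg = intercept + slope * hp on the original scale.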

`gam` package: extra shift spotted when sketching data on `plot.gam`

I am trying to fit a GAM using the gam package (I know mgcv is more flexible, but I need to use gam here). The model looks good, but compared with the original data it seems to be offset along the y-axis by a constant value, and I cannot figure out where this offset comes from.
This code reproduces the problem:
library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <- gam(y ~ s(x,6))
fit$coefficients
#(Intercept) s(x, 6)
# 1.921819 -2.318771
plot(fit, ylim = range(y))
points(x, y)
points(x, y -1.921819, col=2)
legend("topright", pch=1, col=1:2, legend=c("Original", "Minus intercept"))
Chambers, J. M. and Hastie, T. J. (1993) Statistical Models in S (Chapman & Hall) shows that there should not be an offset, and this is also intuitively correct (the smooth should describe the data).
I noticed something comparable in mgcv, which can be solved by providing the shift parameter with the intercept value of the model (because the smooth is seemingly centred). I thought the same could be true here, so I subtracted the intercept from the original data points. However, the plot above shows that this idea is wrong. I don't know where the extra shift comes from. I hope someone here may be able to help me.
(R version. 3.3.1; gam version 1.12)
I think I should first explain the various outputs in the fitted GAM model:
library(gam)
data(gam.data)
x <- gam.data$x
y <- gam.data$y
fit <-gam(y ~ s(x,6), model = FALSE)
## coefficients for parametric part
## this includes intercept and null space of spline
beta <- coef(fit)
## null space of spline smooth (a linear term, just `x`)
nullspace <- fit$smooth.frame[,1]
nullspace - x ## all 0
## smooth space that are penalized
## note, the backfitting procedure guarantees that this is centred
pensmooth <- fit$smooth[,1]
sum(pensmooth) ## centred
# [1] 5.89806e-17
## estimated smooth function (null space + penalized space)
smooth <- nullspace * beta[2] + pensmooth
## centred smooth function (this is what `plot.gam` is going to plot)
c0 <- mean(smooth)
censmooth <- smooth - c0
## additive predictors (this is just fitted values in Gaussian case)
addpred <- beta[1] + smooth
You can first verify that addpred is what fit$additive.predictors gives, and since we are fitting an additive model with a Gaussian response, this is also the same as fit$fitted.values.
What plot.gam does, is to plot censmooth:
plot.gam(fit, col = 4, ylim = c(-1.5,1.5))
points(x, censmooth, col = "gray")
Remember that
addpred = beta[1] + censmooth + c0
If you want to shift the original data y to match this plot, you need to subtract not only the intercept (beta[1]) but also c0 from y:
points(x, y - beta[1] - c0)

Predict out of sample using flexsurvreg in R

I have the following model in R
library(flexsurv)
data(ovarian)
model = flexsurvreg(Surv(futime, fustat) ~ ecog.ps + rx, data = ovarian, dist='weibull')
model
predict(model,data = ovarian, type = 'response')
The model summary looks like this: [image: flexsurvreg model output]
I am trying to predict the survival time using the predict function in R and get the following error:
[image: error while trying to predict]
How can I predict expected lifetime using this flexsurvreg model?
I understand that the documentation mentions a totlos.fs function, but this data does not seem to have a trans variable that totlos.fs requires to provide an output.
If there is no other alternative to totlos.fs how can I create a trans variable in this data and handle it along with existing covariates?
Please advise.
Section 3 of the supplementary examples document in the flexsurv documentation has an example in which the predicted values are calculated directly from the model equation. As you are using the Weibull distribution (with n=2 parameters), I believe this should work:
n <- 2 # number of distribution parameters (shape and scale)
pred.model <- model.matrix(model) %*% model$res[-(1:n),"est"]
Cheers
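To go from a linear predictor to an expected lifetime, the Weibull mean formula can be used: mean = scale * gamma(1 + 1/shape). The sketch below uses base R only and assumes, as I understand the flexsurvreg Weibull parameterisation, that covariates act on the scale through a log link. The shape, baseline scale, and coefficients are hypothetical numbers chosen for illustration, not estimates from the ovarian fit.

```r
# Hypothetical parameter values (NOT taken from the fitted model)
shape  <- 1.2
scale0 <- 1500                          # baseline scale, in days
beta   <- c(ecog.ps = -0.3, rx = 0.5)   # covariate effects on log(scale)

# Covariate values for two hypothetical patients
newdat <- data.frame(ecog.ps = c(1, 2), rx = c(1, 2))

# Individual scale: baseline scale times exp(linear predictor)
lin     <- drop(as.matrix(newdat) %*% beta)
scale_i <- scale0 * exp(lin)

# Expected and median lifetime under the Weibull distribution
mean_life   <- scale_i * gamma(1 + 1 / shape)
median_life <- qweibull(0.5, shape = shape, scale = scale_i)
print(cbind(newdat, mean_life, median_life))
```

With a real fit, the hypothetical numbers would be replaced by the values in the est column of model$res.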
Nik,
I know your question is an old one, but see below how I hacked a way to do it. It involves retrieving the shape and rate parameters from a fit to the training data; then, instead of predict, you use qgompertz() from flexsurv. Please excuse the use of my own encapsulated example code, but you should be able to follow along.
# generate the training data "lung1" from data(lung) in survival package
# hacked way for truncating the lung data to 2 years of follow up
require(survival)
lung$yrs <- lung$time/365
lung1 <- lung[c("status", "yrs")]
lung1$status[ lung1$yrs >2] <- 1
lung1$yrs[ lung1$yrs >2] <- 2
# from the training data build KM to obtain survival %s
s <- Surv(time=lung1$yrs, event=lung1$status)
km.lung <- survfit(s ~ 1, data=lung1)
plot(km.lung)
# generate dataframe to use later for plotting
cut.length <- sum((km.lung$time <= 2)) # so I can create example test data
test.data <- data.frame(yrs = km.lung$time[1:cut.length] , surv=round(km.lung$surv[1:cut.length], 3))
##
## doing the same as above with gompertz
##
require(flexsurv) #needed to run gompertz model
s <- Surv(time=lung1$yrs, event=lung1$status)
gomp <- flexsurvreg(s ~ 1, data=lung1, dist="gompertz") # run this to get shape and rate estimates for gompertz
gomp # notice the shape and rate values
# create variables for these values
g.shape <- 0.5866
g.rate <- 0.5816
##
## plot data and visualize the Gompertz fit
##
# vars for plotting
df1 <- test.data
xvar <- "yrs"
yvar <- "surv"
extendedtime <- 3 #
ylim1 <- c(0,1)
xlim1 <- c(0, extendedtime)
# plot the survival % for training data
plot(df1[,yvar]~df1[,xvar], type="S", ylab="", xlab="", lwd=3, xlim=xlim1, ylim=ylim1)
# Nik--here is where the magic happens... pay special attention to: qgompertz(seq(.01,.99,by=.01), shape=g.shape, rate=g.rate)
lines(qgompertz(seq(.01,.99,by=.01), shape=g.shape, rate=g.rate), seq(.99,.01,by=-.01), col="red", lwd=2, lty=2)
# generate a km curve from the testing data
s <- Surv(time=lung$yrs, event=lung$status)
km.lung <- survfit(s ~ 1, data=lung)
par(new=T)
# now draw remaining survival curve from the testing section
plot(km.lung$surv[(cut.length+1):length(km.lung$time)]~km.lung$time[(cut.length+1):length(km.lung$time)], type="S", col="blue", ylab="", xlab="", lwd=3, xlim=xlim1, ylim=ylim1)

How to get predicted values on Sigmoid Growth model in R

I am trying to forecast future revenue by using Sigmoid growth model in R. The model is like this:
Y = a/( 1+ce^(-bX) ) + noise
my code:
n <- 100 # number of points on the curve
x <- seq(-5,5,length=n)
y <- 1/(1+exp(-x))
plot(y~x, type='l', lwd=3)
title(main='Sigmoid Growth')
I could draw the plot, but I don't know how to get the future values. Suppose I want to predict the next 6 years revenue values.
Make y a function, and plot that (plot has special support for functions):
y <- function(x) 1/(1+exp(-x))
plot(y,-5,11,type="l",lwd=3)
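To get actual forecasts rather than just a longer curve, one option is to estimate a, b and c from the observed data with nls and then call predict for the future years. The sketch below uses simulated revenue figures, and the starting values are rough guesses; both are assumptions for illustration only. The parameter c from the model is named cpar in the code to avoid confusion with R's c() function.

```r
# Simulated yearly revenue following a sigmoid with noise (hypothetical data)
set.seed(123)
year <- 1:15
revenue <- 100 / (1 + 20 * exp(-0.5 * year)) + rnorm(length(year), sd = 2)

# Fit Y = a / (1 + c * exp(-b * X)); start values are rough guesses
fit <- nls(revenue ~ a / (1 + cpar * exp(-b * year)),
           start = list(a = 90, b = 0.4, cpar = 15))

# Predict the next 6 years
future <- data.frame(year = 16:21)
future$revenue_hat <- predict(fit, newdata = future)
print(future)

# Plot the data, the fitted curve, and the forecasts
plot(year, revenue, xlim = c(1, 21), ylim = c(0, 110))
lines(1:21, predict(fit, newdata = data.frame(year = 1:21)), col = "blue")
points(future$year, future$revenue_hat, col = "red", pch = 16)
```

With real data, nls may need better starting values to converge; a plot of the data usually suggests a plausible asymptote for a.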
