I am relatively new to R. I was using auto.arima and predict to forecast time series data. Here is the code I am using:
train.arima=auto.arima(train, seasonal=F, xreg=NULL)
train.pd=predict(train.arima, n.ahead=numahead, newxreg=NULL)
I am getting this error message, even though I already set xreg and newxreg to NULL:
Error in predict.Arima(train.arima, n.ahead = numahead, newxreg = NULL) :
'xreg' and 'newxreg' have different numbers of columns: 1 != 0
Could anyone help please??
Use forecast, not predict. auto.arima sometimes selects a model with a drift term, which cannot be handled by a simple call to predict.
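A minimal sketch of that approach, assuming the forecast package is installed and numahead is the desired horizon:

library(forecast)

# auto.arima() may include a drift term; forecast() handles it, unlike predict()
train.arima <- auto.arima(train, seasonal = FALSE, xreg = NULL)
train.fc    <- forecast(train.arima, h = numahead)

train.fc$mean   # point forecasts (the analogue of predict()'s $pred component)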
I am trying to implement ordinal logistic regression on my dataset in R. I use the function 'polr' for this, but cannot seem to find much information regarding its implementation.
The following errors are the ones I'm stuck on:
> dat.polr <- polr(as.factor(relevance)~allterms+idf.title, data=dat.one)
Warning message:
In polr(as.factor(relevance) ~ allterms + idf.title + idf.desc + :
design appears to be rank-deficient, so dropping some coefs
> dat.pred <- predict(dat.polr,dat.test,type="class")
Error in X %*% object$coefficients : non-conformable arguments
I want to train my model to predict the relevance of a new dataset. dat.one is the dataset I'm using to train the model and dat.test is the dataset I'm using to test it. I believe the error from predict is caused by the warning from polr. However, I have no clue how to resolve this. Any help would be appreciated :)
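Not a definitive answer, but a hedged diagnostic, assuming dat.one and dat.test are as described above: compare how many coefficients polr actually kept after dropping rank-deficient columns with the number of columns predict() will build from the test data. If the two numbers differ, the matrix product inside predict.polr is non-conformable.

library(MASS)  # polr lives here

dat.polr <- polr(as.factor(relevance) ~ allterms + idf.title, data = dat.one)

# Coefficients polr kept after the rank-deficiency warning
length(coef(dat.polr))

# Predictor columns (excluding the intercept) that predict() will construct from dat.test
ncol(model.matrix(~ allterms + idf.title, data = dat.test)) - 1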
I am trying to use the gBox() function from the TSA package in R. I want to test the goodness of fit for a GARCH model. But when I try to run the function I get this error message:
Error in filter(M, filter = beta, method = "recursive", sides = 1, init = rep(sigma2,  :
  'dims' [product 2] do not match the length of object [1]
The funny thing is that this is an exact replica of the example from the package documentation, so one would think there shouldn't be any errors. I get the same error message for my own data as well, and I just don't know what to do.
Here is the example code:
library(TSA)
library(tseries)
data(CREF)
r.cref=diff(log(CREF))*100
m1=garch(x=r.cref,order=c(1,1))
summary(m1)
gBox(m1,x=r.cref,method='squared')
The length of the time series r.cref is 500 and the length of the garch fit m1 is 10, so they're obviously not the same length, but how do I fix this error?
I'm building a segmented regression model using R's segmented package.
I was able to create the model but have trouble using the predict.segmented function. It always throws an error saying "subscript out of bounds"
This is the exact error message:
Error in newdata[[nameZ[i]]] : subscript out of bounds
Traceback just gives this:
1: predict.segmented(seg_model, xtest)
I created a simple case that gives the same error:
require(segmented)
x = c(1:90, 991:1000)
y = c((x[1:10]/2), (x[11:100]*2))
lm_model = lm(y~x)
seg_model = segmented(lm_model, seg.Z=~x, psi=list(x=NA),
control=seg.control(display=FALSE, K=1, random=TRUE))
xtest = c(1:1000)
predict.segmented(seg_model, xtest)
I am starting to think this could be a bug. I'm new to R and not sure how to debug this either. Any help is appreciated!
You are using predict.segmented incorrectly. Like nearly all predict() functions, the newdata parameter should be a data.frame, not a vector. Also, it needs to have names that match the variables used in your regression. Try
predict.segmented(seg_model, data.frame(x=xtest))
instead. When using a function for the first time, be sure to read the help page (?predict.segmented) to know what the function expects for each of its parameters.
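Putting that together with the example above, a minimal working sketch looks like this:

require(segmented)

x <- c(1:90, 991:1000)
y <- c(x[1:10] / 2, x[11:100] * 2)
lm_model  <- lm(y ~ x)
seg_model <- segmented(lm_model, seg.Z = ~x, psi = list(x = NA),
                       control = seg.control(display = FALSE, K = 1, random = TRUE))

# newdata is a data.frame whose column name matches the regressor in the model
xtest <- data.frame(x = 1:1000)
preds <- predict(seg_model, newdata = xtest)  # dispatches to predict.segmented
head(preds)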
I am doing just a regular logistic regression using the caret package in R. I have a binomial response variable coded 1 or 0, called SALE_FLAG, and 140 predictor variables that I transformed to dummy variables using the dummyVars function in R.
data <- dummyVars(~., data = data_2, fullRank=TRUE,sep="_",levelsOnly = FALSE )
dummies<-(predict(data, data_2))
model_data<- as.data.frame(dummies)
This gives me a data frame to work with. All of the variables are numeric. Next I split into training and testing:
trainIndex <- createDataPartition(model_data$SALE_FLAG, p = .80,list = FALSE)
train <- model_data[ trainIndex,]
test <- model_data[-trainIndex,]
Time to train my model using the train function:
model <- train(SALE_FLAG ~ ., data = train, method = "glm")
Everything runs nicely and I get a model. But when I run the predict function it does not give me what I need:
predict(model, newdata =test,type="prob")
and I get an ERROR:
Error in dimnames(out)[[2]] <- modelFit$obsLevels :
length of 'dimnames' [2] not equal to array extent
On the other hand, when I replace "prob" with "raw" for type inside the predict function, I get predictions, but I need probabilities so I can convert them into a binary variable given my threshold.
Not sure why this happens. I did the same thing without using the caret package and it worked as it should:
model2 <- glm(SALE_FLAG ~ ., family = binomial(logit), data = train)
predict(model2, newdata =test, type="response")
I spent some time looking at this but I'm not sure what is going on, and it seems very weird to me. I have tried many variations of the train function, for instance passing x and y instead of a formula. I also tried method = 'bayesglm' to check, and it gave me the same error. I hope someone can help me out. I don't strictly need the train function to get what I need, but caret is a good package with lots of tools and I would like to be able to figure this out.
Show us str(train) and str(test). I suspect the outcome variable is numeric, which makes train think that you are doing regression. That should also be apparent from printing model. Make it a factor if you want to do classification.
Max
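A minimal sketch of that suggestion, assuming the 0/1 outcome described above. The level labels here ("no_sale" and "sale") are made up; caret just needs a factor outcome with syntactically valid level names for type = "prob" to work.

# Recode the outcome as a factor so train() does classification, not regression.
# The labels "no_sale"/"sale" are hypothetical; any valid R names will do.
train$SALE_FLAG <- factor(train$SALE_FLAG, levels = c(0, 1),
                          labels = c("no_sale", "sale"))
test$SALE_FLAG  <- factor(test$SALE_FLAG,  levels = c(0, 1),
                          labels = c("no_sale", "sale"))

model <- train(SALE_FLAG ~ ., data = train, method = "glm")
predict(model, newdata = test, type = "prob")  # now returns class probabilities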
I'm getting the error message that "Type of predictors in new data do not match that of the training data."
This confuses me, since I am able to get the same data sets working under rpart and ctree. Those functions conveniently report which factors are causing the problem, so it's easy to debug. Right now I'm not sure which factors among my many dimensions are causing problems.
Is there a simple way to know which columns/variables are throwing randomForest off?
For what it's worth:
> write.csv(predict(object=train_comp.rp, newdata = test_w_age, type = c("prob")), file="test_predict_rp_w_age.csv")
> write.csv(predict(object=train_comp.rf, newdata = test_w_age, type = c("prob")), file="test_predict_rf_w_age.csv")
Error in predict.randomForest(object = train_comp.rf, newdata = test_w_age, : Type of predictors in new data do not match that of the training data.
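Not a definitive answer, but a hedged way to narrow it down: compare the classes (and, for factor columns, the levels) of the training predictors against the new data. The name train_data below is a placeholder for whatever data frame train_comp.rf was fit on.

# Hypothetical diagnostic: find predictors whose class differs between the
# training data and the new data passed to predict.randomForest().
train_classes <- sapply(train_data,  function(col) class(col)[1])
test_classes  <- sapply(test_w_age,  function(col) class(col)[1])

common <- intersect(names(train_classes), names(test_classes))
common[train_classes[common] != test_classes[common]]  # columns whose class differs

# For factor columns, also check that the levels line up, e.g.
# setdiff(levels(train_data$some_factor), levels(test_w_age$some_factor))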