Hi everyone, I am trying to calculate accuracy statistics for a hierarchical time series using the hts package, but I get an error that says "Error in x - fcasts : non-conformable arrays".
library(hts)
abc <- matrix(sample(1:100, 32*140, replace=TRUE), ncol=32)
colnames(abc) <- c(
  paste0("A0", 1:5),
  paste0("B0", 1:9), "B10",
  paste0("C0", 1:8),
  paste0("D0", 1:5),
  paste0("E0", 1:4)
)
abc <- ts(abc, start=2019, frequency=365.25/7)
x <- hts(abc, characters = c(1,2))
data <- window(x, start = 2019.000, end = 2021.166)
test <- window(x, start = 2021.185)
fcasts <- forecast(data, h = 20, method = "bu")
accuracy(fcasts, test)
accuracy(fcasts, test, levels = 1)
Then the error message is:
> data <- window(x, start = 2019.000, end = 2021.166)
> test <- window(x, start = 2021.185)
> fcasts <- forecast(data, h = 20, method = "bu")
There were 32 warnings (use warnings() to see them)
> accuracy(fcasts, test)
Error in x - fcasts : non-conformable arrays
> accuracy(fcasts, test, levels = 1)
Error in x - fcasts : non-conformable arrays
Thank you
This is a bug in the hts package, which I've now fixed in the dev version (https://github.com/earowang/hts/commit/3f444cf6d6aca23a3a7f2d482df2e33bb078dc55).
Using the CRAN version, the problem is avoided by using the same forecast horizon (h) as the length of the test set.
There was another bug in accuracy() triggered by weekly data which I've also fixed.
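As a minimal sketch of that CRAN workaround (assuming a version where the weekly-data issue does not bite), you can derive h from the test window instead of hard-coding 20:
# match the forecast horizon to the length of the test window
h <- nrow(test$bts)                      # number of test-set time points
fcasts <- forecast(data, h = h, method = "bu")
accuracy(fcasts, test)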
I think the problem occurs because fcasts and test are list objects rather than plain time series.
Try this:
accuracy(fcasts$bts, test$bts)
accuracy(fcasts$bts, test$bts, levels = 1)
I am trying to run ridge/lasso regression with the glmnet and onehot packages and am getting an error.
library(glmnet)
library(onehot)
set.seed(123)
Sample <- HouseData[1:1460, ]
smp_size <- floor(0.5 * nrow(Sample))
train_ind <- sample(seq_len(nrow(Sample)), size = smp_size)
train <- Sample[train_ind, ]
test <- Sample[-train_ind, ]
############Ridge & Lasso Regressions ################
# Define the response for the training + test set
y_train <- train$SalePrice
y_test <- test$SalePrice
# Define the x training and test
x_train <- train[, !names(train) == "SalePrice"]
x_test <- test[, !names(test) == "SalePrice"]
str(y_train)
## encoding information for training set
x_train_encoded_data_info <- onehot(x_train, stringsAsFactors = TRUE, max_levels = 50)
x_train_matrix <- predict(x_train_encoded_data_info, x_train)
x_train_matrix <- as.matrix(x_train_matrix)
# create encoding information for x test
x_test_encoded_data_info <- onehot(x_test, stringsAsFactors = TRUE, max_levels = 50)
x_test_matrix <- predict(x_test_encoded_data_info, x_test)
str(x_train_matrix)
###Calculate best lambda
cv.out <- cv.glmnet(x_train_matrix, y_train,
                    alpha = 0, nlambda = 100,
                    lambda.min.ratio = 0.0001)
best.lambda <- cv.out$lambda.min
best.lambda
model <- glmnet(x_train_matrix, y_train, alpha = 0, lambda = best.lambda)
results_ridge <- predict(model,newx=x_test_matrix)
I know my data is clean and my matrices are the same size, but I keep getting this error when I try to run my prediction.
Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'as.matrix': Cholmod error 'X and/or Y have wrong dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 90
My professor has also told me to one-hot encode before I split my data, but that makes no sense to me.
It's hard to pin down that specific error, but a likely culprit is that you build separate onehot() encodings for the training and test sets: whenever a factor's levels differ between the two splits, the encoded matrices end up with different columns, and the prediction step fails with a dimension mismatch (which is also why your professor suggests encoding before splitting).
That said, I would recommend using the old built-in standby function model.matrix (or its sparse cousin, sparse.model.matrix, if you have larger datasets) for creating the x argument to glmnet. model.matrix will automatically one-hot encode factor or categorical variables for you. It requires a model formula as input, which you can create from your dataset as shown below.
# create the model formula
y_variable <- "SalePrice"
model_formula <- as.formula(paste(y_variable, "~",
                                  paste(names(train)[names(train) != y_variable], collapse = "+")))
# test & train matrices
x_train_matrix <- model.matrix(model_formula, data = train)[, -1]
x_test_matrix <- model.matrix(model_formula, data = test)[, -1]
###Calculate best lambda
cv.out <- cv.glmnet(x_train_matrix, y_train,
                    alpha = 0, nlambda = 100,
                    lambda.min.ratio = 0.0001)
A second, newer option is to use the built-in glmnet function makeX(), which builds the matrices from your train/test data frames with matching columns. Its output can be fed straight into cv.glmnet() as the x argument, as below.
## option 2: use the built-in glmnet function makeX() to create x matrices
x_matrices <- glmnet::makeX(train = train[, !names(train) == "SalePrice"],
                            test = test[, !names(test) == "SalePrice"])
###Calculate best lambda
cv.out <- cv.glmnet(x_matrices$x, y_train,
                    alpha = 0, nlambda = 100,
                    lambda.min.ratio = 0.0001)
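As a sketch of the remaining steps (mirroring your original code), fit the model at the selected lambda and predict on the matching test matrix that makeX() returns as xtest:
# fit the ridge model at the selected lambda, then predict on the test matrix
best.lambda <- cv.out$lambda.min
model <- glmnet(x_matrices$x, y_train, alpha = 0, lambda = best.lambda)
results_ridge <- predict(model, newx = x_matrices$xtest)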
Following the model-based recursive partitioning approach in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6015941/ I want to replicate the following code:
sim_data <- function(n=2000){
  x1 <- rnorm(n)
  x2 <- rbinom(n, 1, 0.3)
  x3 <- runif(n)
  x4 <- rnorm(n)
  t <- rbinom(n, 1, 0.5)
  z <- 1 - x2 + x1 + 2*(x1>=0)*x2*t - 2*(x1<0)*x2*t
  pr <- 1/(1+exp(-z))
  y <- as.factor(rbinom(n, 1, pr))
  data.frame(x1, x3, x2=as.factor(x2), x4, t=factor(t, labels=c("C","A")), y, z)
}
dt <- sim_data()
dt.num = as.data.frame(sapply(dt, as.numeric))
dt.num$y <- dt.num$y-1 #only to convert outcome 1,2 into 0,1
mbase <- glm(y ~ t, data = dt.num,
             family = binomial())
round(summary(mbase)$coefficients,3)
library("model4you")
pmtr <- pmtree(mbase, zformula = ~. ,
               data = dt.num,
               control = ctree_control(minbucket = 250))
plot(pmtr, terminal_panel = node_pmterminal(pmtr,
                                            plotfun = binomial_glm_plot,
                                            confint = TRUE))
However, the following inexplicable error occurs:
Error in .Call.graphics(C_palette2, .Call(C_palette2, NULL)) :
invalid graphics state
I looked for a solution in the post "Persistent invalid graphics state error when using ggplot2", but the problem persists.
Any clue?
Thank you in advance
When I tried to replicate this, I got a different error:
plot(pmtr, terminal_panel = node_pmterminal(pmtr, plotfun = binomial_glm_plot, confint = TRUE))
## Waiting for profiling to be done...
## Error in plotfun(mod = list(coefficients = c(`(Intercept)` = -0.16839363929017, :
## Plotting currently only works for models with a single factor covariate.
## We recommend using partykit or ggparty plotting functionalities!
The reason for this is that the panel function expects both the response and the treatment to be binary factors (as in dt). When you use binary numeric variables instead (as in dt.num) the model estimation in glm() leads to equivalent output but the plot() functionality is confused.
When I refit both the glm() and the pmtree() with dt rather than dt.num, everything works as intended for me and the terminal-node plot displays correctly.
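For completeness, a minimal sketch of that refit (the same calls as above, only with dt in place of dt.num):
# refit on the factor-coded data so the panel function sees binary factors
mbase <- glm(y ~ t, data = dt, family = binomial())
pmtr <- pmtree(mbase, zformula = ~. ,
               data = dt,
               control = ctree_control(minbucket = 250))
plot(pmtr, terminal_panel = node_pmterminal(pmtr,
                                            plotfun = binomial_glm_plot,
                                            confint = TRUE))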
I have price data for an asset and want to fit a Markov switching model (with 2 states). The code I have run is below. Price is configured as numeric and date as a Date. I am not sure where I'm going wrong.
library(MSwM)
# Loading required package: parallel
library(ggplot2)
nstates <- 6
olsPrice <- lm(PriceUSD~date, Priced)
msmPrice <- msmFit(olsPrice, k = nstates, sw = c(FALSE, TRUE, TRUE))
The error message I get is:
Error in w * matrix(resid(modaux), ncol = k, byrow = T)^2 :
non-conformable arrays
I've made a minimal reproducible example for the problem I'm facing.
Data for Y (monthly dependent variable):
monthlytest <- c(-.035, 0.455)
ytest <- ts(monthlytest, start=c(2008,8), frequency=12)
Data for X (daily explanatory variable), built with the zoo package:
library(zoo)
lol1 <- paste(2008, sprintf("%02d", rep(1:12, each=30)), sprintf("%02d", 1:30), sep="-")[211:270]
lol2 <- seq(0.015, 0.078, length.out=60)
xtest <- zoo(lol2, order.by = lol1)
Load the modelling package:
library(midasr)
Run regression:
beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtest, 3:30, 30))
When this final line of code is run I get the error below. What am I doing wrong?
Error in matrix(NA, nrow = n - nrow(X), ncol = ncol(X)) :
invalid 'nrow' value (< 0)
The error is produced by the function mls:
> mls(xtest, 3:30, 30)
Error in matrix(NA, nrow = n - nrow(X), ncol = ncol(X)) :
invalid 'nrow' value (< 0)
This happens because mls expects a numeric argument. Converting xtest to numeric solves that problem:
xtn <- as.numeric(xtest)
beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtn, 3:30, 30))
This produces another error:
Error in prepmidas_r(y, X, mt, Zenv, cl, args, start, Ofunction, user.gradient, :
argument "start" is missing, with no default
This means that you did not specify start, which is mandatory for the function midas_r. Your model is an unrestricted MIDAS model, so either use the function midas_u or supply start = NULL. But even this does not help:
> beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtn, 3:30, 30),start=NULL)
Error in midas_r.fit(prepmd) :
Not possible to estimate MIDAS model, more parameters than observations
You have two low-frequency observations, which in theory allows you to estimate two parameters; your model has 29. So you need at least 30 low-frequency observations (you lose one observation to the lagged dependent variable) to estimate this model.
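To illustrate, here is a sketch with made-up simulated data, used only to show the required sample size (36 monthly observations, comfortably above the 30 needed):
set.seed(1)
ytest2 <- ts(rnorm(36), start = c(2008, 8), frequency = 12)  # 36 monthly observations
xtn2 <- rnorm(36 * 30)                                       # 30 daily observations per month
beta <- midas_r(ytest2 ~ mls(ytest2, 1, 1) + mls(xtn2, 3:30, 30), start = NULL)
coef(beta)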
I'm testing the kernlab package on a regression problem. It seems to be a common issue to get 'Error in .local(object, ...) : test vector does not match model !' when passing the ksvm object to the predict function. However, I have only found answers for classification problems or custom kernels, which are not applicable to my problem (I'm using a built-in kernel for regression). I'm running out of ideas here; my sample code is:
data <- matrix(rnorm(200*10), 200, 10)
tr <- data[1:150, ]
ts <- data[151:200, ]
mod <- ksvm(x = tr[, -1],
            y = tr[, 1],
            kernel = "rbfdot", type = 'nu-svr',
            kpar = "automatic", C = 60, cross = 3)
pred <- predict(mod, ts)
You forgot to remove the y variable from the test set, so it fails because the number of predictors doesn't match. This will work:
predict(mod,ts[,-1])
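A quick way to see the mismatch (a sketch using the objects defined above):
ncol(tr[, -1])   # 9 columns were used to train the model
ncol(ts)         # 10 columns; the response is still in column 1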
You can use pred <- predict(mod, ts) if ts is a data frame and the model is fitted via the formula interface. It would be:
data <- setNames(data.frame(matrix(rnorm(200*10), 200, 10)),
                 c("Y", paste("X", 1:9, sep = "")))
tr <- data[1:150, ]
ts <- data[151:200, ]
mod <- ksvm(as.formula("Y ~ ."), data = tr,
            kernel = "rbfdot", type = 'nu-svr',
            kpar = "automatic", C = 60, cross = 3)
pred <- predict(mod, ts)