Error when using midasr package - r

I've made a minimal reproducible example for the problem I'm facing.
Data for Y (monthly dependent variable):
monthlytest <- c(-.035, 0.455)
ytest <- ts(monthlytest, start=c(2008,8), frequency=12)
Data for X (daily explanatory variable):
lol1 <- paste(2008, sprintf("%02s",rep(1:12, each=30)), sprintf("%02s", 1:30), sep="-") [211:270]
lol2 <- seq(0.015, 0.078, length.out=60)
xtest <- zoo(lol2, order.by = lol1)
Load package:
library(midasr)
library(zoo)
Run regression:
beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtest, 3:30, 30))
When this final line of code is run I get this error, what am I doing wrong?
Error in matrix(NA, nrow = n - nrow(X), ncol = ncol(X)) :
invalid 'nrow' value (< 0)

The error is produced by the function mls:
> mls(xtest, 3:30, 30)
Error in matrix(NA, nrow = n - nrow(X), ncol = ncol(X)) :
invalid 'nrow' value (< 0)
This happens because mls expects a numeric argument. Converting xtest to numeric solves the problem:
xtn <- as.numeric(xtest)
beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtn, 3:30, 30))
This produces another error:
Error in prepmidas_r(y, X, mt, Zenv, cl, args, start, Ofunction, user.gradient, :
argument "start" is missing, with no default
This means that you did not specify start, which is mandatory for the function midas_r. Your model is an unrestricted MIDAS model, so you need either to use the function midas_u or to supply start=NULL. But even this does not help:
> beta <- midas_r(ytest ~ mls(ytest, 1, 1) + mls(xtn, 3:30, 30),start=NULL)
Error in midas_r.fit(prepmd) :
Not possible to estimate MIDAS model, more parameters than observations
You have two low-frequency observations, which in theory allows you to estimate at most two parameters, but your model has 29. You therefore need at least 30 low-frequency observations (you lose one observation to the lagged dependent variable) to estimate this model.
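To make the counting concrete, here is a minimal base R sketch of the alignment that mls(x, 3:30, 30) performs, assuming that lag k for low-frequency period t refers to high-frequency observation 30*t - k. The simulated series, the helper name hf_lags and the sample size n_low = 40 are illustrative assumptions, and lm() stands in for midas_r(..., start = NULL).

```r
set.seed(1)
m <- 30        # daily observations per month (as in the question)
lags <- 3:30   # high-frequency lags used in the model
n_low <- 40    # hypothetical number of monthly observations

x <- rnorm(m * n_low)  # simulated daily series
y <- rnorm(n_low)      # simulated monthly series

# Hypothetical helper: one row per month, one column per daily lag.
hf_lags <- function(x, lags, m, n_low) {
  t(sapply(seq_len(n_low), function(t) {
    idx <- t * m - lags
    ifelse(idx >= 1, x[pmax(idx, 1)], NA_real_)  # NA when the lag precedes the sample
  }))
}

X <- hf_lags(x, lags, m, n_low)
y_lag1 <- c(NA, y[-n_low])   # lagged dependent variable

fit <- lm(y ~ y_lag1 + X)    # unrestricted regression, intercept included by lm()
ncol(X) + 1                  # 29 regressors
nobs(fit)                    # 39 usable months: the first one is lost to the lags
length(coef(fit))            # 30 coefficients once the intercept is counted
```

With only two monthly observations, a single row would survive the lagging, which is far fewer than the number of coefficients.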

Related

hts non-conformable arrays forecast

Hi everyone, I am trying to calculate the accuracy statistics for a hierarchical time series using the hts package, but I get an error that says "Error in x - fcasts : non-conformable arrays".
library(hts)
abc <- matrix(sample(1:100, 32*140, replace=TRUE), ncol=32)
colnames(abc) <- c(
paste0("A0",1:5),
paste0("B0",1:9),"B10",
paste0("C0",1:8),
paste0("D0",1:5),
paste0("E0",1:4)
)
abc <- ts(abc, start=2019, frequency=365.25/7)
x <- hts(abc, characters = c(1,2))
data <- window(x, start = 2019.000, end = 2021.166)
test <- window(x, start = 2021.185)
fcasts <- forecast(data, h = 20, method = "bu")
accuracy(fcasts, test)
accuracy(fcasts, test, levels = 1)
Then the error message is:
> data <- window(x, start = 2019.000, end = 2021.166)
> test <- window(x, start = 2021.185)
> fcasts <- forecast(data, h = 20, method = "bu")
There were 32 warnings (use warnings() to see them)
> accuracy(fcasts, test)
Error in x - fcasts : non-conformable arrays
> accuracy(fcasts, test, levels = 1)
Error in x - fcasts : non-conformable arrays
Thank you
This is a bug in the hts package, which I've now fixed in the dev version (https://github.com/earowang/hts/commit/3f444cf6d6aca23a3a7f2d482df2e33bb078dc55).
Using the CRAN version, the problem is avoided by using the same forecast horizon (h) as the length of the test set.
There was another bug in accuracy() triggered by weekly data which I've also fixed.
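For illustration, the same message can be reproduced in base R by subtracting matrices with different numbers of rows, which is the situation when the forecast horizon and the test set length differ. The dimensions below are made up for the sketch and are not taken from the hts internals.

```r
fcast_mat <- matrix(0, nrow = 20, ncol = 3)  # h = 20 forecasts for 3 series
test_mat  <- matrix(0, nrow = 25, ncol = 3)  # hypothetical, longer test set

# Subtracting arrays of different shape fails with the error from the question.
msg <- tryCatch(test_mat - fcast_mat, error = function(e) conditionMessage(e))
msg

# Trimming both to a common horizon makes the subtraction well defined.
h <- min(nrow(test_mat), nrow(fcast_mat))
err <- test_mat[seq_len(h), , drop = FALSE] - fcast_mat[seq_len(h), , drop = FALSE]
dim(err)
```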
I think the problem occurs because of the list object for fcasts and test.
Try this:
accuracy(fcasts$bts, test$bts)
accuracy(fcasts$bts, test$bts, levels = 1)

An NA error in my Gibbs sampler for mixture model

I am working on a Gibbs sampler and my code is as follows. The idea is to (1) sample pi first, (2) sample delta, (3) sample beta.
library(foreign)
cognitive = read.dta("http://www.stat.columbia.edu/~gelman/arm/examples/child.iq/kidiq.dta")
summary(cognitive)
cognitive$mom_work = as.numeric(cognitive$mom_work > 1)
cognitive$mom_hs = as.numeric(cognitive$mom_hs > 0)
# Modify column names of the data set
colnames(cognitive) = c("kid_score", "hs", "IQ", "work", "age")
x<-cbind(cognitive$hs, cognitive$IQ, cognitive$work, cognitive$age)
y<-cognitive$kid_score
lmmodel<-lm(y~x-1, data=cognitive)
NSim=3000 #iteration
Betahat=solve(t(x)%*%x)%*%t(x)%*%y
Error in if (delta[ite, j] == 1) rnorm(1, mu1, sigma1) else rnorm(1, mu0, :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In rbinom(1, 1, prob = (p1/(p0 + p1))) : NAs produced
2: In rbinom(1, 1, prob = (p1/(p0 + p1))) : NAs produced
The error is caused by the line prob=(pi[ite]*exp(-beta[ite-1,j]^2/(2*10^2)))/(((1-pi[ite])*10^3)*exp(-beta[ite-1,j]^2/(2*10^(-4)))+pi[ite-1]*exp(-beta[ite-1,j]^2/(2*10^2))): at some iteration prob becomes greater than 1, so rbern() returns NA. Note that the numerator uses pi[ite] while the last term of the denominator uses pi[ite-1], so the ratio is not guaranteed to stay below 1. Check your formula.
Update: for debugging, add the following before your delta[ite,j]=rbern(... line:
prob_full <- (pi[ite]*exp(-beta[ite-1,j]^2/(2*10^2)))/(((1-pi[ite])*10^3)*exp(-beta[ite-1,j]^2/(2*10^(-4)))+pi[ite-1]*exp(-beta[ite-1,j]^2/(2*10^2)));
cat('\n',ite,j,prob_full)
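One way to keep the probability inside [0, 1] is to compute both weights from the same pi and to work on the log scale. The sketch below assumes the formula is meant as a spike-and-slab update with a N(0, 10^2) slab and a N(0, 10^-4) spike (the factor 10^3 is the ratio of the two normalising constants). The function name inclusion_prob and its arguments are hypothetical, not taken from your code.

```r
# Hypothetical helper: P(delta = 1 | beta, pi) for a spike-and-slab prior.
# Assumes slab N(0, sd_slab^2) and spike N(0, sd_spike^2).
inclusion_prob <- function(beta, pi_incl, sd_slab = 10, sd_spike = 0.01) {
  log_p1 <- log(pi_incl)    + dnorm(beta, mean = 0, sd = sd_slab,  log = TRUE)
  log_p0 <- log1p(-pi_incl) + dnorm(beta, mean = 0, sd = sd_spike, log = TRUE)
  plogis(log_p1 - log_p0)   # equals p1 / (p0 + p1), computed stably
}

probs <- inclusion_prob(beta = c(0, 0.05, 2, 400), pi_incl = 0.5)
probs                        # every value lies in [0, 1], even for large |beta|

delta <- rbinom(length(probs), size = 1, prob = probs)  # no NAs produced
```

Because the same pi_incl enters both weights, the ratio cannot exceed 1, and plogis() avoids the 0/0 that appears when both exponentials underflow.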

Correct usage of stats4::mle

I want to use the stats4::mle function to estimate the two parameters of a distribution.
I would like to be sure my usage is correct and get guidance on how to avoid this error:
"Error in optim(start, f, method = method, hessian = TRUE, ...) :
initial value in 'vmmin' is not finite
In addition: Warning message:
In log(mu) : NaNs produced"
Function I would like to estimate is exp(beta0*a + beta1*b) and I would like to estimate the betas
Sample code:
a <- mydata$a # first variable
b <- mydata$b # second variable
y <- mydata$y # observed result
nll <- function(beta0, beta1) {
mu = y - exp(beta0 * a + beta1 * b)
- sum(log(mu))
}
est <- stats4::mle(minuslog = nll, start = list(beta0 = 0.0001, beta1 = 0.0001))
est
So:
Is this the correct way of doing things?
For the error, I understand this is due to values of mu reaching 0 (or going below it), but I don't know what to do about it.
Thanks for your help.
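A note on the setup: -sum(log(y - exp(...))) is not a negative log-likelihood, and log() returns NaN as soon as a fitted value exceeds y, which is consistent with the "not finite" message. The question does not say how y is distributed, so the sketch below assumes, purely for illustration, that y is a Poisson count with mean exp(beta0*a + beta1*b). The data are simulated and the true values 0.5 and 1.0 are made up.

```r
library(stats4)

set.seed(123)
n <- 500
a <- runif(n)                                    # simulated first variable
b <- runif(n)                                    # simulated second variable
y <- rpois(n, lambda = exp(0.5 * a + 1.0 * b))   # assumed Poisson response

# Negative log-likelihood under the Poisson assumption.
nll <- function(beta0 = 0, beta1 = 0) {
  mu <- exp(beta0 * a + beta1 * b)               # always positive, so no NaN
  -sum(dpois(y, lambda = mu, log = TRUE))
}

est <- mle(minuslogl = nll, start = list(beta0 = 0, beta1 = 0))
coef(est)
```

If y is continuous, the same pattern applies with a different density inside nll, for example dnorm() with an extra scale parameter.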

Fitting and Predicting Arima models in R

The strategy is carried out on a "rolling" basis:
For each day n, the previous k days of the differenced logarithmic returns of a stock market index are used as a window for fitting an optimal ARIMA model.
#Install relevant packages
install.packages("quantmod")
install.packages("forecast")
#Import the necessary libraries
library(quantmod)
library(forecast)
#Get S&P 500
getSymbols("^GSPC", from = "2000-01-01")
#Compute the daily returns
gspcRet<-(log(Cl(GSPC)))
#Use only the last two years of returns
gspc500<-tail(gspcRet,500)
spReturns<-diff(gspc500)
spReturns[as.character(head(index(Cl(GSPC)),1))] = 0
# Create the forecasts vector to store the predictions
windowLength<- 500
foreLength<-length(spReturns) - windowLength
forecasts <- vector(mode="list", length=foreLength)
fit1 <- vector(mode="list", length=foreLength)
for (d in 0:foreLength) {
# Obtain the S&P500 rolling window for this day
spReturnsOffset<- spReturns[(1+d):(windowLength+d)]
#Searching for the best models
order.matrix<-matrix(0,nrow = 3, ncol = 6 * 2 * 6)
aic.vec<- numeric(6 * 2 * 6)
k<-1
for(p in 0:5) for(d in 0:1) for(q in 0:5){
order.matrix[,k]<-c(p,d,q)
aic.vec[k]<- AIC(Arima( spReturnsOffset, order=c(p,d,q)))
k<-k+1
}
ind<- order(aic.vec,decreasing=F)
aic.vec<- aic.vec[ind]
order.matrix<- order.matrix[,ind]
order.matrix<- t(order.matrix)
result<- cbind(order.matrix,aic.vec)
#colnames(result)<- c("p","d","q","AIC")
p1<- result[1,1]
p2<- result[2,1]
p3<- result[3,1]
p4<- result[4,1]
d1<- result[1,2]
d2<- result[2,2]
d3<- result[3,2]
d4<- result[4,2]
q1<- result[1,3]
q2<- result[2,3]
q3<- result[3,3]
q4<- result[4,3]
#I THINK CODE IS CORRECT TILL HERE PROBLEM IS WITH THE FOLLOWING CODE I GUESS
fit1[d+1]<- Arima(spReturnsOffset, order=c(p1,d1,q1))
forecasts[d+1]<- forecast(fit1,h=1)
#forecasts[d+1]<- unlist(fcast$mean[1])
}
I get the following Error:
Error in x - fits : non-numeric argument to binary operator
In addition: Warning messages:
1: In fit1[d + 1] <- Arima(spReturnsOffset, order = c(p1, d1, q1)) :
number of items to replace is not a multiple of replacement length
2: In mean.default(x, na.rm = TRUE) :
argument is not numeric or logical: returning NA
Can anyone please suggest a fix?
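The first warning most likely comes from storing a whole model object with single brackets (fit1[d+1] <- ...), and the error from passing the list fit1 rather than one fitted model to forecast(). The inner loop for(d in 0:1) also overwrites the outer loop variable d. Below is a minimal sketch of the rolling pattern using only base R. It swaps in stats::arima and predict() for Arima() and forecast(), uses a fixed AR(1) order instead of the AIC search, and runs on simulated returns, so every name and number in it is an assumption.

```r
set.seed(42)
returns <- rnorm(60)          # simulated stand-in for spReturns
window_length <- 50           # must be shorter than the series
fore_length <- length(returns) - window_length

fits <- vector(mode = "list", length = fore_length)
forecasts <- numeric(fore_length)

for (i in seq_len(fore_length)) {
  # Rolling window for this step
  window_data <- returns[i:(window_length + i - 1)]

  # Double brackets store the whole model object as one list element
  fits[[i]] <- arima(window_data, order = c(1, 0, 0))

  # Forecast from the single fitted model, not from the list of fits
  forecasts[i] <- predict(fits[[i]], n.ahead = 1)$pred[1]
}

length(forecasts)
```

In the question, windowLength equals the length of spReturns, so foreLength is 0. A shorter window or a longer series is needed for the loop to roll at all.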

Error using random forest (MICE package) during imputation

I would like to use the random forest method to impute missing values. I have read some papers claiming that MICE with random forests performs better than parametric MICE.
In my case, I have already run a model with the default mice method, got the results and played with them. However, when I switched the method to random forest, I got an error and I'm not sure why. I've seen some questions about errors with random forest and mice, but they do not match my case. My variables have more than a single NA.
imp <- mice(data1, m=70, pred=quickpred(data1), method="pmm", seed=71152, printFlag=TRUE)
impRF <- mice(data1, m=70, pred=quickpred(data1), method="rf", seed=71152, printFlag=TRUE)
iter imp variable
1 1 Vac
Error in if (n == 0) stop("data (x) has 0 rows") : argument is of length zero
Does anyone have any idea why I'm getting this error?
EDIT
I tried changing all variables to numeric instead of using dummy variables, and it returned the same error plus some warnings:
impRF <- mice(data, m=70, pred=quickpred(data), method="rf", seed=71152, printFlag=TRUE)
iter imp variable
1 1 Vac CliForm
Error in if (n == 0) stop("data (x) has 0 rows") : argument is of length zero
In addition: There were 50 or more warnings (use warnings() to see the first 50)
50: In randomForest.default(x = xobs, y = yobs, ntree = 1, ... :
The response has five or fewer unique values. Are you sure you want to do regression?
EDIT1
I've tried only with 5 imputations and a smaller subset of the data, with only 2000 rows and got a few different errors:
> imp <- mice(data2, m=5, pred=quickpred(data2), method="rf", seed=71152, printFlag=TRUE)
iter imp variable
1 1 Vac Radio Origin Job Alc Smk Drugs Prison Commu Hmless Symp
Error in randomForest.default(x = xobs, y = yobs, ntree = 1, ...) : NAs in foreign
function call (arg 11)
In addition: Warning messages:
1: In randomForest.default(x = xobs, y = yobs, ntree = 1, ...) : invalid mtry: reset to within valid range
2: In max(ncat) : no non-missing arguments to max; returning -Inf
3: In randomForest.default(x = xobs, y = yobs, ntree = 1, ...) : NAs introduced by coercion
I also encountered this error when I had only one fully observed variable, which I'm guessing is the cause in your case too. My colleague Anoop Shah provided me with a fix (below) and Prof van Buuren (mice's author) has said he will include it in the next update of the package.
In R type the following to enable you to redefine the rf impute function.
fixInNamespace("mice.impute.rf", "mice")
The corrected function to paste in is then:
mice.impute.rf <- function (y, ry, x, ntree = 100, ...){
ntree <- max(1, ntree)
xobs <- as.matrix(x[ry, ])
xmis <- as.matrix(x[!ry, ])
yobs <- y[ry]
onetree <- function(xobs, xmis, yobs, ...) {
fit <- randomForest(x = xobs, y = yobs, ntree = 1, ...)
leafnr <- predict(object = fit, newdata = xobs, nodes = TRUE)
nodes <- predict(object = fit, newdata = xmis, nodes = TRUE)
donor <- lapply(nodes, function(s) yobs[leafnr == s])
return(donor)
}
forest <- sapply(1:ntree, FUN = function(s) onetree(xobs,
xmis, yobs, ...))
impute <- apply(forest, MARGIN = 1, FUN = function(s) sample(unlist(s),
1))
return(impute)
}
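The relevant change appears to be the as.matrix() wrapped around x[ry, ] and x[!ry, ]. A short base R sketch of why that matters when there is only one predictor column (the toy matrix and the names below are illustrative):

```r
x  <- matrix(1:5, ncol = 1)                 # a single fully observed predictor
ry <- c(TRUE, FALSE, TRUE, TRUE, FALSE)     # TRUE where y is observed

dropped <- x[ry, ]            # single-column subsetting drops to a plain vector
nrow(dropped)                 # NULL, so a check like `if (n == 0)` sees length zero

kept <- as.matrix(x[ry, ])    # the fix: coerce back to a one-column matrix
dim(kept)                     # 3 1

kept2 <- x[ry, , drop = FALSE]  # equivalent alternative that never drops
dim(kept2)                      # 3 1
```

With nrow() returning NULL, the comparison n == 0 has length zero, which matches the "argument is of length zero" error in the question.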

Resources