xgboost with tree_method = 'hist' in R - r

According to a benchmark of GBM vs. xgboost vs. LightGBM (https://www.kaggle.com/nschneider/gbm-vs-xgboost-vs-lightgbm), it is possible to run xgboost with the argument
tree_method = 'hist'
in R.
However, doing so always gives me an error:
Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) :
Invalid Input: 'hist', valid values are: {'approx', 'auto', 'exact'}
What am I missing?
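The thread does not include an answer. One plausible explanation, offered here as an assumption, is that the installed xgboost build predates the histogram method: the error lists only 'approx', 'auto' and 'exact' as valid values, so that build does not recognise 'hist'. A minimal sketch for checking the installed version and, after updating the package, retrying with tree_method = 'hist' on the agaricus demo data bundled with xgboost (the objective and nrounds values are illustrative):

```r
library(xgboost)

# Which build is installed? An older release may not know 'hist'.
print(packageVersion("xgboost"))

# Demo data shipped with the package.
data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# Illustrative parameters; only tree_method comes from the question.
params <- list(objective = "binary:logistic", tree_method = "hist")
bst <- xgb.train(params = params, data = dtrain, nrounds = 5)
```

If the call still fails with the same message after updating, the build in use was probably compiled without the histogram updater.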

Related

Unused argument error when building a Confusion Matrix in R

I am currently trying to run a logistic regression model on my data frame.
While creating a new data frame with the actual and predicted values, I get the following error message.
Error
Error in confusionMatrix(as.factor(log_class), lgtest$Satisfaction, positive = "satisfied") :
unused argument (positive = "satisfied")
This is my model:
#### Logistic regression model
log_model = glm(Satisfaction~., data = lgtrain, family = "binomial")
summary(log_model)
log_preds = predict(log_model, lgtest[,1:22], type = "response")
head(log_preds)
log_class = array(c(99))
for (i in 1:length(log_preds)) {
  if (log_preds[i] > 0.5) {
    log_class[i] = "satisfied"
  } else {
    log_class[i] = "neutral or dissatisfied"
  }
}
### Creating a new modelframe containing the actual and predicted values.
log_result = data.frame(Actual = lgtest$Satisfaction, Prediction = log_class)
lgtest$Satisfaction = factor(lgtest$Satisfaction, c(1,0),labels=c("satisfied","neutral or dissatisfied"))
lgtest
confusionMatrix(log_class, log_preds, threshold = 0.5) ####this works
mr1 = confusionMatrix(as.factor(log_class),lgtest$Satisfaction, positive = "satisfied") ## this is the line that causes the error
I had the same problem. I typed "?confusionMatrix" and got this output:
Help on topic 'confusionMatrix' was found in the following packages:
confusionMatrix
(in package InformationValue in library /home/beyza/R/x86_64-pc-linux-gnu-library/3.6)
Create a confusion matrix
(in package caret in library /home/beyza/R/x86_64-pc-linux-gnu-library/3.6)
Confusion Matrix
(in package ModelMetrics in library /home/beyza/R/x86_64-pc-linux-gnu-library/3.6)
As this output shows, confusionMatrix is defined in more than one package, so we need to specify which package's version we want to use.
So I called it as "caret::confusionMatrix(...)" and it worked!
This is how to write the call to get rid of the unused argument error when building a confusion matrix in R:
caret::confusionMatrix(
data = new_tree_predict$predicted,
reference = new_tree_predict$actual,
positive = "True"
)
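A self-contained sketch of the same idea, assuming caret is installed. The labels are made up for illustration, and find() is used only to show which attached packages define the name:

```r
# Toy labels (made up for illustration); both factors share the same levels.
lv <- c("satisfied", "neutral or dissatisfied")
actual <- factor(
  c("satisfied", "satisfied", "neutral or dissatisfied", "neutral or dissatisfied"),
  levels = lv
)
predicted <- factor(
  c("satisfied", "neutral or dissatisfied", "neutral or dissatisfied", "satisfied"),
  levels = lv
)

# Lists every attached package that defines the name, in search-path order;
# the first entry is the one an unqualified call would pick up.
print(find("confusionMatrix"))

# Qualifying the call with caret:: removes the ambiguity.
cm <- caret::confusionMatrix(data = predicted, reference = actual, positive = "satisfied")
print(cm$table)
```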

GARCH Optimization in R

I'm trying to fit a GARCH(1,1) model using the rugarch library in R to predict S&P500 returns for the next 25 days. I have hard-coded the ARMA orders p and q to be 4 and 4 respectively for this example. However, each time I run this, it fails with an optimization error.
Here is the code:
library(quantmod)  # getSymbols, Cl
library(rugarch)   # ugarchspec, ugarchfit
getSymbols("^GSPC", from="2016-01-01")
asset = diff(log(Cl(GSPC)))
returns = as.numeric(asset)
spec = ugarchspec(
  variance.model=list(garchOrder=c(1,1)),
  mean.model=list(armaOrder=c(4,4), include.mean=T),
  distribution.model="sged"
)
fit = ugarchfit(spec, returns, solver = 'hybrid')
and here is the error:
Error in optim(init[mask], armaCSS, method = optim.method, hessian =
TRUE, :
initial value in 'vmmin' is not finite
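The thread has no answer. One thing worth ruling out, as an assumption rather than a confirmed diagnosis: diff() on the xts object returned by Cl(GSPC) leaves an NA in the first position, and that NA survives as.numeric(). A minimal sketch, using made-up prices in place of the downloaded series, of checking for and dropping non-finite returns before fitting:

```r
# Made-up closing prices standing in for Cl(GSPC).
prices <- c(100, 101, 99.5, 102, 103.2)

# diff() on an xts series pads the first position with NA;
# mimic that here with a plain numeric vector.
returns <- c(NA, diff(log(prices)))

# Count the values that are NA, NaN or infinite.
n_bad <- sum(!is.finite(returns))
print(n_bad)

# Keep only finite returns before passing them to the fitting routine.
returns_clean <- returns[is.finite(returns)]
print(length(returns_clean))
```

If the error persists with clean input, the ARMA(4,4) mean model itself may be the harder part to initialise; trying a lower order is a cheap second check.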

r - Error message while using h2o.deeplearning

ERROR MESSAGE:
Illegal argument(s) for DeepLearning model: dl_model_faster.
Details: ERRR on field: _stopping_metric: Stopping metric cannot be misclassification for regression.
I am getting this error even though I am using h2o.deeplearning for a classification problem; I don't want to run a regression model. How can I specify that?
I had the same error for h2o.deeplearning(). Converting the dependent variable to factor and then feeding the data to h2o.deeplearning() fixed it for me.
dataset$dependent_variable= factor(dataset$dependent_variable,levels = c(0, 1), labels = c(0, 1))
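A small base-R sketch of that conversion on a made-up data frame (the column x and all values are hypothetical). It only shows the type change; the data would still need to be handed to h2o afterwards, for example with as.h2o():

```r
# Made-up data: a numeric 0/1 response, which is consistent with the
# "cannot be misclassification for regression" message in the question.
dataset <- data.frame(
  x = c(0.2, 1.5, 3.1, 0.7),
  dependent_variable = c(0, 1, 1, 0)
)
print(is.numeric(dataset$dependent_variable))

# Convert the response to a factor so it is treated as a class label.
dataset$dependent_variable <- factor(dataset$dependent_variable, levels = c(0, 1))
print(is.factor(dataset$dependent_variable))
print(levels(dataset$dependent_variable))
```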

What does this error mean while running the ksvm of kernlab package in R

I am calling the ksvm method of the kernlab package in R using the following syntax
svmFit = ksvm(x=solTrainXtrans, y=solTrainYSVM, kernel="stringdot", kpar="automatic", C=1, epsilon=0.1)
The x parameter is a data.frame with feature values and the y parameter is a list with various values.
I get the following error when I run the above line.
Error in do.call(kernel, kpar) : second argument must be a list
What is it trying to tell me here?
Try setting kpar = list(length = 4, lambda = 0.5)
Does it help?
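The message itself comes from do.call(), which requires its second argument to be a list of arguments. A minimal base-R sketch of the same failure and of the fix; toy_kernel is a hypothetical stand-in for a kernel constructor, not kernlab code:

```r
# Hypothetical stand-in for a kernel constructor taking two parameters.
toy_kernel <- function(length = 4, lambda = 0.5) {
  list(length = length, lambda = lambda)
}

# Passing a character string as the argument list reproduces the message.
msg <- tryCatch(
  do.call(toy_kernel, "automatic"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing a named list works.
k <- do.call(toy_kernel, list(length = 4, lambda = 0.5))
print(k$length)
```

This is why replacing kpar="automatic" with a named list, as suggested above, gets past the error for the string kernel.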

Caret and GBM Errors

I am attempting to use the caret package in R for several nested cross-validation processes with user-defined performance metrics. I have had all kinds of problems, so I pulled back to see if there were issues with a more out-of-the-box use of caret, and it seems I have run into one.
If I run the following:
install.packages("caret")
install.packages("gbm")
library(caret)
library(gbm)
data(GermanCredit)
GermanCredit$Class<-ifelse(GermanCredit$Class=='Bad',1,0)
gbmGrid <- expand.grid(.interaction.depth = 1,
                       .n.trees = 150,
                       .shrinkage = 0.1)
gbmMOD <- train(Class~., data=GermanCredit,
                method = "gbm",
                tuneGrid= gbmGrid,
                distribution="bernoulli",
                bag.fraction = 0.5,
                train.fraction = 0.5,
                n.minobsinnode = 10,
                cv.folds = 1,
                keep.data=TRUE,
                verbose=TRUE
)
I get the error (or similar):
Error in { :
task 1 failed - "arguments imply differing number of rows: 619, 381"
with warnings:
1: In eval(expr, envir, enclos) :
model fit failed for Resample01: interaction.depth=1, n.trees=150, shrinkage=0.1
But, if I run just the gbm routine everything finishes fine.
gbm1 <- gbm(Class~., data=GermanCredit,
            distribution="bernoulli",
            n.trees=150, # number of trees
            shrinkage=0.10,
            interaction.depth=1,
            bag.fraction = 0.5,
            train.fraction = 0.5,
            n.minobsinnode = 10,
            cv.folds = 1,
            keep.data=TRUE,
            verbose=TRUE
)
There were two issues: passing cv.folds caused a problem. Also, you don't need to convert the outcome to a binary number; this causes train to think that it is a regression problem. The idea behind the train function is to smooth out the inconsistencies with the modeling functions, so we use factors for classification and numbers for regression.
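A minimal sketch of a call that follows both points, assuming caret and gbm are installed: Class is left as the factor it already is in GermanCredit, and cv.folds is not passed, with resampling left to trainControl instead. The tuneGrid is omitted here as a simplification, because the expected grid column names differ between caret versions; the seed and fold count are illustrative.

```r
library(caret)
data(GermanCredit)

# Class ships as a factor with levels "Bad" and "Good"; do not recode it to 0/1.
stopifnot(is.factor(GermanCredit$Class))

set.seed(1)  # illustrative seed for reproducible resampling
gbmMOD <- train(Class ~ ., data = GermanCredit,
                method = "gbm",
                # resampling is handled by caret, so no cv.folds is passed to gbm
                trControl = trainControl(method = "cv", number = 5),
                verbose = FALSE)
print(gbmMOD$modelType)
```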
Just a note: although this issue was caused by the reason described in the answer, the error message (given below) may also occur with older versions of caret and gbm. I encountered this error, and after spending a lot of time trying to figure out what the issue was, it turned out that I had to upgrade to the most recent versions of caret (5.17-7) and gbm (2.1-0.1). These are the most recent versions on CRAN as of today.
Error in { :
task 1 failed - "arguments imply differing number of rows: ...
