Is it possible to create a lift chart for glm models in R? I know lift charts are usually meant for binary classification models, but my idea was to cut the target variable into ten quantiles and assess whether the predictions fall into the right quantiles, which would turn it into a classification problem of sorts. However, I can only find information on lift charts for binary classification, so I'm wondering whether a function also exists for multiclass problems or whether I need to write one myself.
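For illustration, the decile idea described in the question can be sketched by hand; here fit is an already-fitted glm and df is its data with target column y (both hypothetical names):
pred <- predict(fit, type = "response")
# Assign observations and predictions to deciles of their own distributions
obs_decile  <- cut(df$y, breaks = quantile(df$y, probs = seq(0, 1, 0.1)),
                   include.lowest = TRUE, labels = FALSE)
pred_decile <- cut(pred, breaks = quantile(pred, probs = seq(0, 1, 0.1)),
                   include.lowest = TRUE, labels = FALSE)
# Cross-tabulate: how often do predictions land in the observed decile?
table(observed = obs_decile, predicted = pred_decile)
mean(obs_decile == pred_decile)  # overall decile "hit rate"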
In order to strengthen the interpretation of an interaction term, I would like to create an interaction plot.
Starting Point: I am analyzing a panel data frame, to which I fitted a feasible generalized least squares model using the panelAR function. The model includes an interaction term of two continuous variables.
What I want to do: Create an interaction plot, e.g. following the style of "plot_model" from the package sjPlot (see Three-Way-Interactions: link).
Problem: I could not find any package that supports this type of model, nor a different way to get such a plot.
Question: Is there any workaround for obtaining an interaction plot, or even a package that supports panelAR models?
Since I am quite new to R, I would appreciate any kind of help. Thank you very much.
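For illustration, one possible manual workaround (a hedged sketch, not a supported package): compute predicted values by hand from the coefficient vector. Here fit is the fitted panelAR object, df is the panel data frame, and x and z are the two interacted continuous variables; all names are hypothetical, other covariates are ignored for brevity, and it is assumed that coef() returns the named coefficient vector.
b <- coef(fit)
x_grid <- seq(min(df$x), max(df$x), length.out = 100)
z_vals <- quantile(df$z, probs = c(0.25, 0.50, 0.75))  # moderator levels
# Predicted y over the x grid at each moderator level
yhat <- sapply(z_vals, function(z)
  b["(Intercept)"] + b["x"] * x_grid + b["z"] * z + b["x:z"] * x_grid * z)
matplot(x_grid, yhat, type = "l", lty = 1, col = 1:3,
        xlab = "x", ylab = "Predicted y")
legend("topleft", legend = paste("z =", round(z_vals, 2)), col = 1:3, lty = 1)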
I'm trying to use gbm in R to create a boosted classification tree model for my data.
The problem is that I'm trying to classify my data into multiple labels, and the only classification distribution I can find for gbm ("bernoulli") only works for binary classification.
Is there some change that I could make to my code to create a model which classifies the data into more than just two classes?
library(gbm)
boost <- gbm(label ~ ., data = training,
             distribution = "bernoulli",
             n.trees = 5000,
             interaction.depth = 4)
Try
distribution = "multinomial"
Notice that, although the option does not seem to appear in gbm's documentation, it is indeed available: check the example at the top of page 30 of the PDF manual, where gbm with distribution = "multinomial" is used with the 3-class iris dataset.
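That is, the asker's call with only the distribution changed (training and label are the asker's own objects); with a multinomial fit, predict() then returns per-class probabilities:
library(gbm)
boost <- gbm(label ~ ., data = training,
             distribution = "multinomial",
             n.trees = 5000,
             interaction.depth = 4)
# n x k x 1 array of class probabilities for the k classes
probs <- predict(boost, newdata = training, n.trees = 5000, type = "response")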
I'd like to use the 'e1071' library to fit an SVM model. So far, I've made a model that fits a regression curve to the data set.
(take a look at the purple curve):
However, I want the SVM model to "follow" the data, so that the prediction for each value is as close as possible to the actual data. I think this is possible because of this graph, which shows how an SVM model (model 2) can be similar to an ARIMA model (model 1):
I tried changing the kernel to no avail. Any help will be much appreciated.
Fine-tuning an SVM is no easy task. Have you considered other models, for example GAMs (generalized additive models)? These work well on very curvy data.
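A minimal sketch of that GAM alternative with the mgcv package (df with columns x and y stands in for your data; mgcv chooses the smoothness of s(x) automatically):
library(mgcv)
fit <- gam(y ~ s(x), data = df)
plot(df$x, df$y)                 # raw data
ord <- order(df$x)
lines(df$x[ord], fitted(fit)[ord], col = "purple", lwd = 2)  # fitted curve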
I was referring to this link, which explains the usage of the ROCR package for plotting ROC curves and other related accuracy metrics. The author mentions logistic regression at the beginning, but do these functions (prediction and performance from ROCR) apply to other classification algorithms like SVM, decision trees, etc.?
I tried using the prediction() function with the results of my SVM model, but it threw a format error even though the arguments were of the same type and dimensions. I am also not sure whether, if we plot ROC curves for these algorithms, we would get a shape like the one we generally see with logistic regression (like this).
The prediction and performance functions are model-agnostic in the sense that they only require the user to input actual and predicted values from a binary classifier. (More precisely, this is what prediction requires, and performance takes as input an object output by prediction.) Therefore, you should be able to use both functions for any classification algorithm that can output this information, including both logistic regression and SVM.
Having said this, model predictions can come in different formats (e.g., propensity scores vs. classes; results stored as numeric vs. factor), and you'll need to ensure that what you feed into prediction is appropriate. This can be quite specific: for example, while the predictions argument can represent continuous or class information, it can't be a factor. See the "details" section of the function's help file for more information.
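For illustration, here is a hedged sketch of feeding e1071 SVM scores into ROCR for a binary outcome; train, test, the outcome column y, and the positive-class label "1" are all hypothetical placeholders, not anything required by either package.
library(e1071)
library(ROCR)
fit <- svm(y ~ ., data = train, probability = TRUE)
p <- predict(fit, newdata = test, probability = TRUE)
scores <- attr(p, "probabilities")[, "1"]  # numeric scores, not a factor
pred <- prediction(scores, test$y)         # actual and predicted values
perf <- performance(pred, "tpr", "fpr")
plot(perf)                                 # the ROC curve
performance(pred, "auc")@y.values[[1]]     # area under the curve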
I've been using the caret package in R to run some boosted regression tree and random forest models, and I'm hoping to generate prediction intervals for a set of new cases using the built-in cross-validation routine.
The trainControl function allows you to save the hold-out predictions at each of the n folds, but I'm wondering whether unknown cases can also be predicted at each fold using the built-in functions, or whether I need a separate loop to build the models n times.
Any advice is much appreciated.
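For reference, the saving part mentioned above would look something like this (a minimal sketch; dat and the "rf" method are placeholders for my actual data and models):
library(caret)
ctrl <- trainControl(method = "cv", number = 10, savePredictions = "final")
fit  <- train(y ~ ., data = dat, method = "rf", trControl = ctrl)
head(fit$pred)  # hold-out predictions from each resampling fold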
Check the R package quantregForest, available on CRAN. It can easily calculate prediction intervals for random forest models. There's a nice paper by the author of the package explaining the background of the method. (Sorry, I can't say anything about prediction intervals for BRT models; I'm looking for those myself...)
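A minimal sketch (X, y, and X_new are placeholders for a predictor matrix, a numeric response, and the new cases):
library(quantregForest)
qrf <- quantregForest(x = X, y = y)
# 90% prediction intervals (plus the median) for the new cases
predict(qrf, newdata = X_new, what = c(0.05, 0.50, 0.95))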