Trees in R: regression vs classification

I am using the tree library in R, but when I fit data with the tree() function, sometimes I get a regression tree and sometimes a classification tree. What is this about? Thanks!

From the help page (?tree):

"The left-hand-side (response) should be either a numerical vector when a regression tree will be fitted or a factor, when a classification tree is produced."
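A minimal illustration with the built-in iris data: the class of the response is what decides which kind of tree tree() fits.

library(tree)

# Numeric response -> regression tree
reg_tree <- tree(Sepal.Length ~ Sepal.Width + Petal.Length, data = iris)

# Factor response -> classification tree
cls_tree <- tree(Species ~ ., data = iris)

# Note: a numeric 0/1 column is treated as numeric; wrap it in
# factor() first if you want a classification tree instead.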

Related

Lift chart in R with a glm model or multiclass classification

Is it possible to create a lift chart for glm models in R? I know lift charts are mostly meant for binary classification models, but my idea was to cut the target variable into ten quantiles and assess whether the predictions fall into the right quantiles, which would make it a classification problem of sorts. However, I can only find information on lift charts for binary classification, so I am wondering whether a function also exists for multiclass classification or whether I need to write one myself.
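For the binary case, a decile lift table is only a few lines to write yourself. A hedged sketch, where pred and y are hypothetical names for the fitted probabilities and the observed 0/1 outcome; for a multiclass model you would repeat this per class using that class's predicted probability column.

# Decile lift table: observed response rate per probability bin,
# divided by the overall base rate. Assumes pred has no heavy ties
# (otherwise the quantile breaks may not be unique).
lift_table <- function(pred, y, bins = 10) {
  brks <- quantile(pred, probs = seq(0, 1, length.out = bins + 1))
  decile <- cut(pred, breaks = brks, include.lowest = TRUE, labels = FALSE)
  rate <- tapply(y, decile, mean)  # observed response rate per bin
  rate / mean(y)                   # lift relative to the base rate
}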

Is there any way to dredge a pglmm_compare model in R (phyr and MuMIn packages)?

I am doing a comparative analysis, and my response variable is binary (0 or 1), so I need a phylogenetically corrected analysis with a binomial error distribution. I used the pglmm_compare function from the phyr package (https://rdrr.io/github/daijiang/phyr/man/pglmm_compare.html) to create a full model with all of my variables, but MuMIn does not accept this output as a 'global model', so I cannot dredge it. I am looking for a way to find the best models and possibly perform model averaging from them, but it seems these packages are not compatible. Building all the models by hand would be difficult, since I have ~8 explanatory variables. Is there any way of dredging a phylogenetic model with a binomial error structure? Thanks in advance.
You would need to implement at least the following methods for dredge and model.avg to work with pglmm_compare:
nobs.pglmm_compare(object, ...)
logLik.pglmm_compare(object, ...)
coef.pglmm_compare(object, ...)
coefTable.pglmm_compare(model, ...)
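A rough sketch of what those S3 methods could look like. The component names used here ($nobs, $logLik, $B, $B.se) are assumptions about the fitted object's internals; inspect your model with str() and adjust accordingly.

# Hedged sketch -- verify the component names with str(fit) first.
nobs.pglmm_compare <- function(object, ...) object$nobs
logLik.pglmm_compare <- function(object, ...) {
  ll <- object$logLik
  attr(ll, "df") <- length(object$B)  # assumed degrees of freedom
  attr(ll, "nobs") <- object$nobs
  class(ll) <- "logLik"
  ll
}
coef.pglmm_compare <- function(object, ...) {
  setNames(as.vector(object$B), rownames(object$B))
}
coefTable.pglmm_compare <- function(model, ...) {
  out <- cbind(Estimate = as.vector(model$B),
               `Std. Error` = as.vector(model$B.se))
  rownames(out) <- rownames(model$B)
  out
}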

SVM for data prediction in R

I'd like to use the e1071 library to fit an SVM model. So far, I have a model whose fitted regression curve (the purple curve in my first plot) is too smooth. I want the SVM model to "follow" the data, so that the prediction for each value is as close as possible to the actual observation. I think this is possible, because my second graph shows how an SVM (model 2) can track the data much like an ARIMA model (model 1). I tried changing the kernel, to no avail. Any help will be much appreciated.
Fine-tuning an SVM is no easy task. Have you considered other models, for example GAMs (generalized additive models)? They work well on very curvy data.
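If you want to stay with e1071, the wiggliness of a radial-kernel SVM fit is driven mostly by gamma, cost, and epsilon, which tune.svm can grid-search. A sketch, assuming a data frame df with predictor x and response y (hypothetical names):

library(e1071)
# Grid-search the radial-kernel hyperparameters; larger gamma and
# cost with smaller epsilon make the fit follow the data more
# closely, at the risk of overfitting.
tuned <- tune.svm(y ~ x, data = df,
                  gamma = 10^(-2:2), cost = 10^(0:3),
                  epsilon = c(0.01, 0.1, 0.5))
fit <- tuned$best.model
df$pred <- predict(fit, df)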

Obtaining the Linear Regression Model at each Leaf for M5P model

I am trying to figure out how to get the linear model at each leaf of a tree generated by the M5P method in the RWeka library in R, written out to a text file, so that I can build a separate look-up calculator (say in Excel, for non-R users).
I am using:

library(RWeka)
model <- M5P(response ~ predictorA + predictorB, data = train)
I can get the tree output from model$classifier as a matrix; this works great thanks to this post. If I give the command:

model

R prints model$classifier (the tree structure), followed by the LM at each leaf. I want to extract the coefficients of the LM at each leaf. Using the following code, I am able to get the LM coefficients out of R:
library(rJava)
# Pull Weka's printed model as one string, split it into lines,
# and drop the header lines (1-2 and 6) that are not part of the tree.
ModelTree <- as.matrix(
  scan(text = .jcall(model$classifier, "S", "toString"),
       sep = "\n", what = "")
)[-c(1:2, 6), , drop = FALSE]
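From there, writing the per-leaf models to a text file is straightforward. A hedged sketch: the "LM num:" marker is how M5P's printed output labels each leaf model in the Weka versions I have seen, so adjust the pattern if yours differs.

# Dump the full printed model (tree + per-leaf LMs) to a plain
# text file that non-R users can open.
txt <- scan(text = .jcall(model$classifier, "S", "toString"),
            sep = "\n", what = "")
writeLines(txt, "m5p_model.txt")
# Each leaf model starts at a line like "LM num: 1" (assumed format);
# these indices mark the blocks if you want to parse coefficients.
lm_starts <- grep("^LM num:", txt)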

Can we get probabilities from random forest the same way we get them in logistic regression?

I have a data structure with a binary 0-1 variable (click & purchase; click & no purchase) against a vector of attributes. I used logistic regression to get the probabilities of purchase. How can I use random forest to get the same probabilities? Is it by using random forest regression? Or is it random forest classification with type = 'prob' in R, which gives the probability of the categorical variable?
It won't give you the same result, since the structures of the two methods are different. Logistic regression is given by a definitive linear specification, whereas RF is a collective vote from multiple independent/random trees. If the specification and input features are properly tuned for both, they can produce comparable results. Here is the major difference between the two:

RF gives a more robust fit against noise, outliers, overfitting, multicollinearity, etc., which are common pitfalls in regression-type solutions. Basically, if you don't know, or don't want to know, much about what's going on with the input data, RF is a good start.

Logistic regression is good if you know the data expertly and how to properly specify the equation, or if you want to engineer how the fit/prediction works; the explicit form of the GLM specification allows you to do that.
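To answer the mechanical part of the question: yes, fit a classification forest (factor response) and predict with type = "prob". A sketch, assuming a data frame df with a factor column purchase (levels "0"/"1") and predictor columns (hypothetical names):

library(randomForest)
set.seed(1)
# The response must be a factor; with a numeric 0/1 column,
# randomForest fits a regression forest instead.
rf <- randomForest(purchase ~ ., data = df)
p_rf <- predict(rf, newdata = df, type = "prob")[, "1"]  # P(purchase = 1)

# Logistic regression probabilities, for comparison:
glm_fit <- glm(purchase ~ ., data = df, family = binomial)
p_glm <- predict(glm_fit, type = "response")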
