I want to build a bagged logistic regression model in R. My dataset is heavily imbalanced: only 0.007% of the observations are positive.
My idea for handling this was to use bagged logistic regression. I came across the hybridEnsemble package in R. Does anyone have an example of how this package can be used? I searched online but unfortunately did not find any examples.
Any help will be appreciated.
I would try the h2o.stackedEnsemble() function in the h2o R package. You can automatically create more balanced classifiers by setting balance_classes = TRUE in all of the base learners. More information about how to use this function to create ensembles is in the Stacked Ensemble section of the H2O docs.
Also, H2O will typically be much faster than an implementation written in native R.
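If you want to see what bagging a logistic regression amounts to before reaching for a package, here is a minimal base R sketch. It is not the hybridEnsemble or h2o API. The toy data, the helper names bagged_glm and predict_bagged, and the neg_ratio setting are all assumptions made up for illustration. Each bag resamples the positives with replacement and draws a limited number of negatives, so every model sees a less skewed sample.

```r
set.seed(42)

# Synthetic, imbalanced toy data (an assumption; replace with your own data frame)
n  <- 5000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-5 + 1.5 * x1 - x2))  # positives are rare
df <- data.frame(y = y, x1 = x1, x2 = x2)

# Hypothetical helper: fit n_bags logistic regressions, each on a bootstrap
# sample made of resampled positives plus neg_ratio times as many negatives.
bagged_glm <- function(formula, data, response, n_bags = 25, neg_ratio = 5) {
  pos <- which(data[[response]] == 1)
  neg <- which(data[[response]] == 0)
  lapply(seq_len(n_bags), function(i) {
    idx <- c(
      sample(pos, length(pos), replace = TRUE),
      sample(neg, min(length(neg), neg_ratio * length(pos)), replace = TRUE)
    )
    glm(formula, data = data[idx, ], family = binomial())
  })
}

# Hypothetical helper: average the predicted probabilities over all bags.
predict_bagged <- function(models, newdata) {
  probs <- do.call(cbind, lapply(models, predict,
                                 newdata = newdata, type = "response"))
  rowMeans(probs)
}

models <- bagged_glm(y ~ x1 + x2, df, response = "y")
p_hat  <- predict_bagged(models, df)
```

Because the negatives are down-sampled, the averaged probabilities are shifted upward relative to the true base rate. Treat them as a ranking score, or recalibrate them, rather than reading them as literal probabilities.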
Hello Stack Overflow community,
I'm working on a university project in which we are trying to create a Bayesian network classifier from data in R.
Ideally the classifier should be based on a General Bayesian Network (GNB) or a BN Augmented Naive Bayes (BAN).
Unfortunately I have yet to find a suitable package to create either of those networks in R.
My research led me to the following two packages:
bnclassify, the most prominent package for BN classification, doesn't include GNBs or BANs at all.
bnlearn offers the possibility to learn GNBs, but according to its creator the learning is focused on recovering the correct dependence structure rather than maximizing predictive accuracy for classification. I've tried to use them for my classification problem nonetheless, but the result was underwhelming.
So my question is whether anyone knows an R package for classifying with GNBs or BANs,
OR how to work with the GNBs from bnlearn to improve their predictive accuracy on classification problems.
Thank you in advance for your help.
Best Regards
I'm surprised at the number of R neural network packages that don't appear to have a parameter for regularization/lambda/weight decay. I'm assuming I'm missing something obvious. When I use a package like mlr and look at the integrated learners, I don't see parameters for regularization.
For example, nnTrain from the deepnet package (image: list of params).
I see parameters for just about everything, even dropout, but not lambda or anything else that looks like regularization.
My understanding of both caret and mlr is that they basically organize other ML packages and try to provide a consistent way to interact with them. I'm not finding L1/L2 regularization in any of them.
I've also done 20 Google searches looking for R packages with regularization but found nothing. What am I missing? Thanks!
I looked through more of the models within mlr (a daunting task) and eventually found the h2o package learners. In mlr, the classif.h2o.deeplearning model has every parameter I could think of, including L1 and L2.
Installing h2o is as simple as:
install.packages('h2o')
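Once h2o is installed, setting the penalties through mlr could look roughly like this. This is a sketch under assumptions: the iris task, the layer sizes, and the l1/l2 values are arbitrary placeholders rather than recommendations, and training needs a working Java installation so that the local H2O instance can start.

```r
library(mlr)  # the classif.h2o.deeplearning learner also needs the h2o package

# l1 and l2 set the strength of the L1 and L2 penalties on the weights.
# The values below are illustrative only.
lrn <- makeLearner(
  "classif.h2o.deeplearning",
  par.vals = list(hidden = c(16L, 16L), epochs = 20, l1 = 1e-5, l2 = 1e-4)
)

# Small built-in dataset, used here purely as a stand-in for your own data
task <- makeClassifTask(data = iris, target = "Species")

mod  <- train(lrn, task)
pred <- predict(mod, task = task)
performance(pred, measures = acc)
```

Calling getParamSet("classif.h2o.deeplearning") lists the remaining tunable parameters of the learner.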
In the caret package, which ensemble models can be used for multiclass classification?
Also, when I try some of the functions mentioned at http://topepo.github.io/caret/Ensemble_Model.html, I get:
Not in caret's built-in library.
Google doesn't suggest relevant packages for many of these functions either. Could anyone kindly help me out with both of these questions?
Most of them can (assuming that they are not solely regression models). We've listed the exclusions here.
Here you can see an overview that also lists packages needed.
I'm really confused about regression models and functions in R. Here is my problem. I'm using the pls package to fit a model like Y ~ X. To do that I have to use 'plsr':
model <- plsr(Y ~ X, ncomp = 10, data = df1, center = TRUE, scale = TRUE, validation = "LOO")
I couldn't find the source of 'plsr' in the pls source code, but the help page says it refers to 'mvr {pls}', which I could find. First, is 'plsr' a function or a model in R terminology? Is it built into R? And how does it refer to the 'mvr' function in the pls package?
Thanks
I was wondering whether Weka's functionality for building model trees such as M5P, which have regression models in the leaves, is available in R. I know there is a way to do it using the RWeka package. What seemed strange to me is that this functionality does not exist in other R packages such as rpart. Is RWeka the only way to get a "model tree"?
Thanks for clarification.
Please check the Cubist and CORElearn packages.
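As a rough illustration of the first suggestion, here is a minimal sketch using the Cubist package on the built-in mtcars data (the dataset choice is an assumption made purely for the example). Cubist fits a rule-based model in which each rule carries its own linear regression, which is close in spirit to M5-style model trees, though it is not the same algorithm as Weka's M5P.

```r
library(Cubist)

# Predictors and response taken from a built-in dataset (illustrative only)
x <- mtcars[, setdiff(names(mtcars), "mpg")]
y <- mtcars$mpg

# Fit a single rule-based model; each rule has its own linear model
fit <- cubist(x = x, y = y)

summary(fit)                       # prints the rules and their linear models
pred <- predict(fit, newdata = x)  # numeric predictions, one per row of x
```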