Inter-rater agreement (Fleiss' Kappa, Krippendorff's Alpha etc) Java API? - math

I am working on building a Question Classification/Answering corpus as a part of my masters thesis. I'm looking at evaluating my expected answer type taxonomy with respect to inter-rater agreement/reliability, and I was wondering: Does anybody know of any decent (preferably free) Java API(s) that can do this?
I'm reasonably certain all I need is Fleiss' Kappa and Krippendorff's Alpha at this point.
Weka provides a kappa statistic in it's evaluation package, but I think it can only evaluate a classifier and I'm not at that stage yet (because I'm still building the data set and classes).
Thanks.

I ported a matlab implementation of Fleiss' kappa to Python/numpy.
http://code.google.com/p/hydrat/source/browse/src/hydrat/common/fleiss.py
It is not difficult to implement, perhaps you could port it to Java yourself.

Check QDAP (Pittsburg University) Open Source code.

I couldn't find an existing Java API in time for my research, so I ended up implementing both Fleiss' Kappa and Krippendorff's Alpha myself. Preliminary results for our research can be found in this paper.

Related

mlr3 optimized average of ensemble

I try to optimize the averaged prediction of two logistic regressions in a classification task using a superlearner.
My measure of interest is classif.auc
The mlr3 help file tells me (?mlr_learners_avg)
Predictions are averaged using weights (in order of appearance in the
data) which are optimized using nonlinear optimization from the
package "nloptr" for a measure provided in measure (defaults to
classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg).
Learned weights can be obtained from $model. Using non-linear
optimization is implemented in the SuperLearner R package. For a more
detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code I think LearnerClassifAvg$new() defaults to "classif.ce", is that true?
I think I could set it to classif.auc with param_set$values <- list(measure="classif.auc",optimizer="nloptr",log_level="warn")
The help file refers to the SuperLearner package and LeDell 2015. As I understand it correctly, the proposed "AUC-Maximizing Ensembles through Metalearning" solution from the paper above is, however, not impelemented in mlr3? Or do I miss something? Could this solution be applied in mlr3? In the mlr3 book I found a paragraph regarding calling an external optimization function, would that be possible for SuperLearner?
As far as I understand it, LeDell2015 proposes and evaluate a general strategy that optimizes AUC as a black-box function by learning optimal weights. They do not really propose a best strategy or any concrete defaults so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
The LearnerClassifAvg basically implements what is proposed in LeDell2015 namely, it optimizes the weights for any metric using non-linear optimization. LeDell2015 focus on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc" you get a meta-learner that optimizes AUC. The default with respect to which optimization routine is used deviates between mlr3pipelines and the SuperLearner package, where we use NLOPT_LN_COBYLA and SuperLearner ... uses the Nelder-Mead method via the optim function to minimize rank loss (from the documentation).
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead bbotk::Optimizer similar to here that simply wraps stats::optim with method Nelder-Mead and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA delivers somewhat comparable results, LeDell2015 has a comparison of the different optimizers for further reference.
Thanks for spotting the error in the documentation. I agree, that the description is a little unclear and I will try to improve this!

Difference between "mlp" and "mlpML"

I'm using the Caret package from R to create prediction models for maximum energy demand. What i need to use is neural network multilayer perceptron, but in the Caret package i found out there's 2 of the mlp method, which is "mlp" and "mlpML". what is the difference between the two?
I have read description from a book (Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization) but it still doesnt answer my question.
Caret has 238 different models available! However many of them are just different methods to call the same basic algorithm.
Besides mlp there are 9 other methods of calling a multi-layer-perceptron one of which is mlpML. The real difference is only in the parameters of the function call and which model you need depends on your use case and what you want to adapt about the basic model.
Chances are, if you don't know what mlpML or mlpWeightDecay,etc. does you are fine to just use the basic mlp.
Looking at the official documentation we can see that:
mlp(size) while mlpML(layer1,layer2,layer3) so in the first method you can only tune the size of the multi-layer-perceptron while in the second call you can tune each layer individually.
Looking at the source code here:
https://github.com/topepo/caret/blob/master/models/files/mlp.R
and here:
https://github.com/topepo/caret/blob/master/models/files/mlpML.R
It seems that the difference is that mlpML allows several hidden layers:
modelInfo <- list(label = "Multi-Layer Perceptron, with multiple layers",
while mlp has one single layer with hidden units.
The official documentation also hints at this difference. In my opinion, it is not particularly useful to have many different models that differ only very slightly, and the documentation does not explain those slight differences well.

Minbucket and weights in rpart

A couple questions for the rpart and party experts.
1) I am trying to understand the difference of the control parameter "minbucket" in rpart and party. Is it correct that minbucket in rpart is unweighted (even if weights are provided to fit the tree)?
2) Can anyone briefly describe how the weights are used in the rpart algorithm? I tried to download and review the source code, but I couldn't make much sense of it being a newbie. rpart calls a C function (C_rpart), which seems to be the main part of rpart, but I couldn't find more information about it.
Thanks so much in advance.
The weights parameter in rpart (and in most other machine learning algorithms) can be considered to be exactly equivalent to duplicating those training items that many times. A weight of 5 is the same as having that line repeated 5 times. You can explicitly create this using some simple code, provided that your data set is small enough:
data[rep(1:nrow(data),times=data$weights),]

adaptive sampling implementation in julia

I would like to implement an adaptive sampling algorithm in Julia, in n-dimensions, for plotting and numerical integration purposes. As a starting point I found:
https://mathematica.stackexchange.com/questions/216/adaptive-sampling-for-slow-to-compute-functions-in-2d
As I am quite new to Julia, any help would be much appreciated. First of all, is any of this functionality already implemented in existing libraries? I mean, anything I could use as a starting base?
PS: I plan to update this thread as I progress with the programming.
The julia package Mamba has a few samplers implemented:
Adaptive Mixture Metropolis (AMM)
Adaptive Metropolis within Gibbs (AMWG)
Missing Values Sampler (MISS)
No-U-Turn Sampler (NUTS)
Shrinkage Slice (Slice)
See documentation for details:
http://mambajl.readthedocs.org/en/latest/index.html

Implementation of Particle Swarm Optimization Algorithm in R

I'm checking a simple moving average crossing strategy in R. Instead of running a huge simulation over the 2 dimenional parameter space (length of short term moving average, length of long term moving average), I'd like to implement the Particle Swarm Optimization algorithm to find the optimal parameter values. I've been browsing through the web and was reading that this algorithm was very effective. Moreover, the way the algorithm works fascinates me...
Does anybody of you guys have experience with implementing this algorithm in R? Are there useful packages that can be used?
Thanks a lot for your comments.
Martin
Well, there is a package available on CRAN called pso, and indeed it is a particle swarm optimizer (PSO).
I recommend this package.
It is under actively development (last update 22 Sep 2010) and is consistent with the reference implementation for PSO. In addition, the package includes functions for diagnostics and plotting results.
It certainly appears to be a sophisticated package yet the main function interface (the function psoptim) is straightforward--just pass in a few parameters that describe your problem domain, and a cost function.
More precisely, the key arguments to pass in when you call psoptim:
dimensions of the problem, as a vector
(par);
lower and upper bounds for each
variable (lower, upper); and
a cost function (fn)
There are other parameters in the psoptim method signature; those are generally related to convergence criteria and the like).
Are there any other PSO implementations in R?
There is an R Package called ppso for (parallel PSO). It is available on R-Forge. I do not know anything about this package; i have downloaded it and skimmed the documentation, but that's it.
Beyond those two, none that i am aware of. About three months ago, I looked for R implementations of the more popular meta-heuristics. This is the only pso implementation i am aware of. The R bindings to the Gnu Scientific Library GSL) has a simulated annealing algorithm, but none of the biologically inspired meta-heuristics.
The other place to look is of course the CRAN Task View for Optimization. I did not find another PSO implementation other than what i've recited here, though there are quite a few packages listed there and most of them i did not check other than looking at the name and one-sentence summary.

Resources