I would like to implement an adaptive sampling algorithm in Julia, in n-dimensions, for plotting and numerical integration purposes. As a starting point I found:
https://mathematica.stackexchange.com/questions/216/adaptive-sampling-for-slow-to-compute-functions-in-2d
As I am quite new to Julia, any help would be much appreciated. First of all, is any of this functionality already implemented in existing libraries? I mean, anything I could use as a starting base?
PS: I plan to update this thread as I progress with the programming.
The julia package Mamba has a few samplers implemented:
Adaptive Mixture Metropolis (AMM)
Adaptive Metropolis within Gibbs (AMWG)
Missing Values Sampler (MISS)
No-U-Turn Sampler (NUTS)
Shrinkage Slice (Slice)
See documentation for details:
http://mambajl.readthedocs.org/en/latest/index.html
Related
I try to optimize the averaged prediction of two logistic regressions in a classification task using a superlearner.
My measure of interest is classif.auc
The mlr3 help file tells me (?mlr_learners_avg)
Predictions are averaged using weights (in order of appearance in the
data) which are optimized using nonlinear optimization from the
package "nloptr" for a measure provided in measure (defaults to
classif.acc for LearnerClassifAvg and regr.mse for LearnerRegrAvg).
Learned weights can be obtained from $model. Using non-linear
optimization is implemented in the SuperLearner R package. For a more
detailed analysis the reader is referred to LeDell (2015).
I have two questions regarding this information:
When I look at the source code I think LearnerClassifAvg$new() defaults to "classif.ce", is that true?
I think I could set it to classif.auc with param_set$values <- list(measure="classif.auc",optimizer="nloptr",log_level="warn")
The help file refers to the SuperLearner package and LeDell 2015. As I understand it correctly, the proposed "AUC-Maximizing Ensembles through Metalearning" solution from the paper above is, however, not impelemented in mlr3? Or do I miss something? Could this solution be applied in mlr3? In the mlr3 book I found a paragraph regarding calling an external optimization function, would that be possible for SuperLearner?
As far as I understand it, LeDell2015 proposes and evaluate a general strategy that optimizes AUC as a black-box function by learning optimal weights. They do not really propose a best strategy or any concrete defaults so I looked into the defaults of the SuperLearner package's AUC optimization strategy.
Assuming I understood the paper correctly:
The LearnerClassifAvg basically implements what is proposed in LeDell2015 namely, it optimizes the weights for any metric using non-linear optimization. LeDell2015 focus on the special case of optimizing AUC. As you rightly pointed out, by setting the measure to "classif.auc" you get a meta-learner that optimizes AUC. The default with respect to which optimization routine is used deviates between mlr3pipelines and the SuperLearner package, where we use NLOPT_LN_COBYLA and SuperLearner ... uses the Nelder-Mead method via the optim function to minimize rank loss (from the documentation).
So in order to get exactly the same behaviour, you would need to implement a Nelder-Mead bbotk::Optimizer similar to here that simply wraps stats::optim with method Nelder-Mead and carefully compare settings and stopping criteria. I am fairly confident that NLOPT_LN_COBYLA delivers somewhat comparable results, LeDell2015 has a comparison of the different optimizers for further reference.
Thanks for spotting the error in the documentation. I agree, that the description is a little unclear and I will try to improve this!
I am trying to determine the exact mathematics used to train feed forward networks used for classification in deeplearning4j with stochastic gradient descent. I have tried stepping through the code but am getting lost in the forest.
Is this documented anywhere?
You can find it in the preOutput and backpropGradient methods in the various layers implementations:
Here's the basic implementation used in DenseLayers
And the more complex convolution one used in Convolutions
The implementation might be swapped out when using CuDNN or MKL however.
I need to estimate parameters of continuous-discrete nonlinear stochastic dynamic system using Kalman filtering techniques.
I'm going to use Julia ode45() from ODE and implement Extended Kalman Filter by myself to compute loglikelihood. ODE is written fully in Julia, ForwardDiff supports differentiation of native Julia functions, including nested differentiation, that's what I also need cause I want to use ForwardDiff in my EKF implementation.
Will ForwardDiff handle differentiation of such a comprehensive function like the loglikelihood I've described?
ODE.jl is in maintenance mode so I would recommend using DifferentialEquations.jl instead. In the DiffEq FAQ there is an explanation about using ForwardDiff through the ODE solvers. It works, but as in the FAQ I would recommend using sensitivity analysis since that's a better way of calculating the derivatives (it will take a lot less compilation time). But yes, DiffEqParamEstim.jl is a whole repository for parameter estimation of ODEs/SDEs/DAEs/DDEs and it uses ForwardDiff.jl through the solvers.
(BTW, what you're looking to do sounds interesting. Feel free to get in touch with us in the JuliaDiffEq channel to talk about the development of parameter estimation tooling!)
I'm checking a simple moving average crossing strategy in R. Instead of running a huge simulation over the 2 dimenional parameter space (length of short term moving average, length of long term moving average), I'd like to implement the Particle Swarm Optimization algorithm to find the optimal parameter values. I've been browsing through the web and was reading that this algorithm was very effective. Moreover, the way the algorithm works fascinates me...
Does anybody of you guys have experience with implementing this algorithm in R? Are there useful packages that can be used?
Thanks a lot for your comments.
Martin
Well, there is a package available on CRAN called pso, and indeed it is a particle swarm optimizer (PSO).
I recommend this package.
It is under actively development (last update 22 Sep 2010) and is consistent with the reference implementation for PSO. In addition, the package includes functions for diagnostics and plotting results.
It certainly appears to be a sophisticated package yet the main function interface (the function psoptim) is straightforward--just pass in a few parameters that describe your problem domain, and a cost function.
More precisely, the key arguments to pass in when you call psoptim:
dimensions of the problem, as a vector
(par);
lower and upper bounds for each
variable (lower, upper); and
a cost function (fn)
There are other parameters in the psoptim method signature; those are generally related to convergence criteria and the like).
Are there any other PSO implementations in R?
There is an R Package called ppso for (parallel PSO). It is available on R-Forge. I do not know anything about this package; i have downloaded it and skimmed the documentation, but that's it.
Beyond those two, none that i am aware of. About three months ago, I looked for R implementations of the more popular meta-heuristics. This is the only pso implementation i am aware of. The R bindings to the Gnu Scientific Library GSL) has a simulated annealing algorithm, but none of the biologically inspired meta-heuristics.
The other place to look is of course the CRAN Task View for Optimization. I did not find another PSO implementation other than what i've recited here, though there are quite a few packages listed there and most of them i did not check other than looking at the name and one-sentence summary.
I am working on building a Question Classification/Answering corpus as a part of my masters thesis. I'm looking at evaluating my expected answer type taxonomy with respect to inter-rater agreement/reliability, and I was wondering: Does anybody know of any decent (preferably free) Java API(s) that can do this?
I'm reasonably certain all I need is Fleiss' Kappa and Krippendorff's Alpha at this point.
Weka provides a kappa statistic in it's evaluation package, but I think it can only evaluate a classifier and I'm not at that stage yet (because I'm still building the data set and classes).
Thanks.
I ported a matlab implementation of Fleiss' kappa to Python/numpy.
http://code.google.com/p/hydrat/source/browse/src/hydrat/common/fleiss.py
It is not difficult to implement, perhaps you could port it to Java yourself.
Check QDAP (Pittsburg University) Open Source code.
I couldn't find an existing Java API in time for my research, so I ended up implementing both Fleiss' Kappa and Krippendorff's Alpha myself. Preliminary results for our research can be found in this paper.