I am using Distributions.jl since the Julia standard library does not support all the distributions I need.
Within one particular function I need the same random numbers every time. I am looking for a way to always use the same random number generator for this part, but I don't know how to pass it to Distributions.jl.
Using srand is not what I want, since that resets the global RNG.
No, it's not yet possible, due to the fact that Distributions.jl currently uses the Rmath library for this (see https://github.com/JuliaStats/Distributions.jl/issues/197), but it is on the to-do list.
I have a model where some of the input features are calculated from the training dataset (e.g. the average or median of a value). I am trying to perform n-fold cross-validation on this model, but that means that the values of these features would differ depending on the samples selected for training/validation in each fold. Is there a way in h2o (I'm using it in R) to pass a function that calculates those features once the training set has been determined?
It seems like a pretty intuitive feature to have, but I have not been able to find any documentation on something like it out-of-the-box. Does it exist? If so, could someone point me to a resource?
There's no way to do this while using the built-in cross-validation in H2O. If H2O were written in pure R or Python, it would be easy to extend it so that a user could pass in a function that creates custom features within the cross-validation loop. However, the core of H2O is written in Java, so automatically translating an arbitrary user-defined function from R or Python, first into a REST call and then into Java, is not trivial.
Instead, what you'd have to do is write a loop to do the cross-validation yourself and compute the features within the loop.
It sounds like you may be doing target encoding (or something similar), and if that's the case, you'll be interested in this PR to add target encoding in H2O. In the discussion, we talk about the same issue that you're having.
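The manual loop suggested in this answer can be sketched outside H2O. The following is a minimal, library-free Python illustration and not H2O code: the helper names (k_fold_indices, fit_line, mean_squared_error, cross_validate) are made up for this example, the one-variable least-squares fit only stands in for the real model, and the folds are contiguous rather than shuffled. The point is only that the derived feature (here, deviation from the training median) is recomputed inside each fold from the training rows alone.

```python
import statistics


def k_fold_indices(n_rows, n_folds):
    """Split row indices 0..n_rows-1 into n_folds contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (hypothetical stand-in for the real model)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def mean_squared_error(model, xs, ys):
    """Mean squared prediction error of a (slope, intercept) model."""
    slope, intercept = model
    return statistics.mean([(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)])


def cross_validate(values, targets, n_folds):
    """Manual cross-validation; the derived feature is recomputed inside every fold."""
    scores = []
    for holdout in k_fold_indices(len(values), n_folds):
        held = set(holdout)
        train = [i for i in range(len(values)) if i not in held]
        # Derived feature: deviation from the median of the TRAINING rows only,
        # so nothing from the holdout rows leaks into the feature.
        train_median = statistics.median([values[i] for i in train])
        x_train = [values[i] - train_median for i in train]
        x_valid = [values[i] - train_median for i in holdout]
        model = fit_line(x_train, [targets[i] for i in train])
        scores.append(mean_squared_error(model, x_valid, [targets[i] for i in holdout]))
    return scores


# Toy data with an exact linear relationship between value and target.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
targets = [2.0 * v + 1.0 for v in values]
fold_scores = cross_validate(values, targets, n_folds=3)
```

With H2O the same structure would apply in principle: split the frame yourself, compute the features on the training part, apply them to the validation part, then train and score one model per fold.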
When using the GNU Linear Programming Kit (GLPK), I do not want to optimize an LP instance; I only want to check whether a feasible solution exists. (Side note: I'm interested in problems in which all variables have real-valued upper and lower bounds, so the overall solution space is bounded.)
What is the most efficient way to ask GLPK (using library calls) to check the feasibility of an LP instance? I am aware that there are functions such as "glp_adv_basis" and "glp_cpx_basis" that will try to construct an initial basis. But I still need to call a solver routine (glp_simplex, for instance) to check for feasibility, and even when the objective function is set to the constant 0 function, GLPK may still do some work to find an optimum. How do I ensure that it doesn't do so?
Another side note: I am using the C interface, if that makes any difference.
I am integrating a system of ODEs using the MATLAB utility routine ode45. I do not have a reliable way to label plots with the parameters used to produce the plotted results. It would be easy if there were an approved substitute for global variables. It would be possible to write a script that automatically edits the derivative function for each case in order to hard-wire the constants, but there must be a better way.
To specify constants, simply add an equation for each constant and give 0 as its derivative. This adds a column to the result matrix but the constant value is available for use in calculating the other derivatives.
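As a rough illustration of this trick outside MATLAB, here is a small Python sketch. It uses a hand-written fixed-step RK4 integrator, not ode45 (which is adaptive), and the names decay_with_constant and rk4 as well as the decay equation dy/dt = -k*y are assumptions chosen for the example. The constant k travels along as a second state component whose derivative is 0, so its value can be read back from the result rows when labelling a plot.

```python
import math


def decay_with_constant(t, state):
    """Right-hand side for dy/dt = -k * y, with the constant k carried as state."""
    y, k = state
    return [-k * y, 0.0]  # the derivative of the constant is zero


def rk4(f, t0, state0, t_end, n_steps):
    """Classical fixed-step Runge-Kutta integrator (a stand-in for ode45)."""
    h = (t_end - t0) / n_steps
    t, state = t0, list(state0)
    rows = [(t, list(state))]
    for _ in range(n_steps):
        k1 = f(t, state)
        k2 = f(t + h / 2, [s + h / 2 * d for s, d in zip(state, k1)])
        k3 = f(t + h / 2, [s + h / 2 * d for s, d in zip(state, k2)])
        k4 = f(t + h, [s + h * d for s, d in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        t += h
        rows.append((t, list(state)))
    return rows


# Initial state: y(0) = 1.0 and the constant k = 0.5 as an extra component.
rows = rk4(decay_with_constant, 0.0, [1.0, 0.5], 2.0, 200)
final_y, final_k = rows[-1][1]
plot_label = f"k = {final_k}"  # the parameter is recovered from the result itself
```

The cost is one extra column per constant in the result, as the answer notes; in exchange the parameters used for a run can never get out of sync with the plotted data.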
I have a calculation in R that needs to iteratively call a function for a fixed point contraction mapping. I've been using the squarem function out of the SQUAREM package by Ravi Varadhan. Today while trying to figure out a way around an issue I was having with squarem I came across the TURBOEM package, also by Varadhan. At first glance TURBOEM seems to do the same things as SQUAREM, but with additional functionality in some dimensions.
Does anyone know whether one or the other of these packages is preferred, either in general or for particular applications? Is one more current/updated than the other? TURBOEM seems to have the ability to customize the convergence criterion, which might get me out of the current bind I'm in, but I'm concerned there might be other issues. Obviously I can go off and test the corresponding functions from each package, but if someone out there knows some background on the two packages it might save me a ton of time.
There are four underlying SQUAREM algorithms used by each package. They are effectively identical*. You can see the underlying functions for yourself by using:
SQUAREM:::cyclem1
SQUAREM:::cyclem2
SQUAREM:::squarem1
SQUAREM:::squarem2
turboEM:::bodyCyclem1
turboEM:::bodyCyclem2
turboEM:::bodySquarem1
turboEM:::bodySquarem2
* apart from some differences due to the way in which these are used within the packages. Also, the argument called method in SQUAREM is called version in turboEM.
I would say turboEM would probably be preferred in general, for the following reasons:
As you mention, turboEM allows the user to select the convergence criterion, either based on the L2-norm of the change in the parameter vector (convtype = "parameter"), the L1-norm of the change in the objective function (convtype = "objfn"), or by specifying a custom function (convfn.user). SQUAREM only checks convergence using the L2-norm of the change in parameter vector.
turboEM can also stop the algorithm prior to convergence based on either the number of iterations (stoptype = "maxiter") or the amount of time elapsed (stoptype = "maxtime"). SQUAREM only stops after the specified number of iterations.
The pconstr and project arguments to turboem allow the user to define parameter space constraints and a function that projects estimates back into the parameter space if these are violated. SQUAREM does not have this functionality.
turboEM can easily apply multiple versions of the algorithm to the same data (e.g. with different orders, step sizes, ...), by providing a vector to the method argument and a list to the control.method argument...
... and it can do this in parallel via the foreach package.
turboEM also offers a convenient interface through which to apply a vanilla EM algorithm, as well as EM acceleration schemes other than SQUAREM: parabolic EM (method = "pem"), dynamic ECME ("decme") and quasi-Newton ("qn").
The turboEM package also provides the turboSim function, which allows the user to easily conduct benchmark studies comparing the different acceleration schemes.
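To make the first point above (a user-selectable convergence criterion) concrete, here is a language-neutral sketch in Python. It is not the turboEM or SQUAREM API: fixed_point, l2_converged and relative_converged are hypothetical names, and the plain iteration x <- cos(x) stands in for an EM update without any acceleration. It only shows how a fixed-point loop can take the convergence test as an argument and fall back on an iteration limit.

```python
import math


def l2_converged(old, new, tol=1e-10):
    """Default criterion: L2-norm of the change in the parameter vector."""
    return math.sqrt(sum((n - o) ** 2 for o, n in zip(old, new))) < tol


def relative_converged(old, new, tol=1e-10):
    """A custom criterion: largest relative change across the components."""
    return max(abs(n - o) / max(abs(o), 1e-12) for o, n in zip(old, new)) < tol


def fixed_point(update, start, converged=l2_converged, max_iter=1000):
    """Iterate x <- update(x) until converged(old, new) holds or max_iter is reached.

    Returns (estimate, iterations_used, converged_flag).
    """
    x = list(start)
    for iteration in range(1, max_iter + 1):
        new_x = update(x)
        if converged(x, new_x):
            return new_x, iteration, True
        x = new_x
    return x, max_iter, False


def cosine_map(x):
    """Toy contraction mapping whose fixed point solves x = cos(x)."""
    return [math.cos(x[0])]


default_result = fixed_point(cosine_map, [1.0])
custom_result = fixed_point(cosine_map, [1.0], converged=relative_converged)
capped_result = fixed_point(cosine_map, [1.0], max_iter=3)
```

Passing the criterion as a function is what makes it possible to swap in a problem-specific test without touching the iteration itself.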
The one downside that I can see to using turboEM instead of SQUAREM is that, if you are really interested in the particulars of the SQUAREM algorithm, the trace provided by squarem gives more specific information (residual, extrapolation, step length) than that provided by turboem (objective function [if calculated], iteration number, L2-norm of parameter change).
One final aside: The current version of SQUAREM on CRAN (v 2016.8-2) has indeed been updated more recently than the version of turboEM on CRAN (v 2014.8-1). However, the NEWS suggests that the only updates to SQUAREM since December 2010 have been vignettes and demos, while turboEM's first release was in December 2011.
Thanks for your interest in SQUAREM and turboEM. I am the author of both packages. In future, you may contact me directly with any questions.
The goals of the two packages are different. SQUAREM implements one class of acceleration methods. turboEM, on the other hand, includes a variety of state-of-the-art EM acceleration techniques. The goal of turboEM is to provide a go-to place for all your EM acceleration needs! In particular, turboEM allows you to benchmark the different algorithms for your problem and determine the best one. In my experience, the squarem class of algorithms most often outperforms the other three classes (quasi-Newton, dynamic EM, and parabolic EM). Hence, you might also use the SQUAREM package directly. However, turboEM has a number of additional features, as pointed out by Mark.
Ravi Varadhan
What is the main difference between the two functions, solnp and gosolnp? The R help manual says that gosolnp helps to set the initial parameters correctly. Is there any other difference? Also, if that is the case, how do we determine the correct distribution type for the parameter space?
In my problem, the initial set of parameters is difficult to determine, which is why an optimization approach is used. However, I do have an idea of the upper and lower bounds of the parameters.
gosolnp is an extension of solnp, a wrapper that allows multiple restarts. Simply put, it runs solnp several times (controllable by n.restarts) to avoid getting stuck in local minima. If your function is known to have no local minima other than the global one (e.g., it is convex, which can be shown analytically), use solnp to save time. Otherwise, use gosolnp. If you have any additional information (for instance, a region where the global minimum is expected to be), you can use it for finer control of the distribution of starting parameters: see the arguments distr and distr.opt.
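The general multi-start idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how Rsolnp implements its restarts: compass_search is a basic bounded pattern search standing in for the local solver, multi_start draws starting points uniformly within the bounds, and the one-dimensional objective x^4 - 3x^2 + x is a toy function with one local and one global minimum on [-2, 2].

```python
import random


def compass_search(f, x0, lower, upper, step=0.5, tol=1e-8):
    """Simple bounded local search: try +/- step on each coordinate, halve step on failure."""
    x = list(x0)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                cand = list(x)
                # Clamp the trial point so it never leaves the box constraints.
                cand[i] = min(max(cand[i] + delta, lower[i]), upper[i])
                fc = f(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step /= 2
    return x, fx


def multi_start(f, lower, upper, n_restarts, seed=0):
    """Run the local search from several random starts and keep the best result."""
    rng = random.Random(seed)  # private generator, so results are reproducible
    best_x, best_f = None, float("inf")
    for _ in range(n_restarts):
        start = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        x, fx = compass_search(f, start, lower, upper)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


def objective(x):
    """Toy function: local minimum near x = 1.13, global minimum near x = -1.30."""
    return x[0] ** 4 - 3 * x[0] ** 2 + x[0]


# A single local run started at 1.5 settles in the local minimum ...
single_x, single_f = compass_search(objective, [1.5], [-2.0], [2.0])
# ... while restarts spread over the bounds find the global one.
best_x, best_f = multi_start(objective, [-2.0], [2.0], n_restarts=20)
```

When only the bounds are known, as in the question, uniform sampling within them is the natural default for the starting points; a more informative distribution only pays off if you have a good guess of where the global minimum lies.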