m-estimate for continuous values - math

I'm building a custom regression tree and want to use the m-estimate for pruning.
Does anyone know how to calculate it?
http://www.ailab.si/blaz/predavanja/UISP/slides/uisp07-RegTrees.ppt might help (slide 12: what should Em look like?)

There are a lot of m-estimates. They all boil down to recasting your estimation problem as a minimization problem. If you use squared error as the function you're minimizing, you just get the sample mean. If you use the absolute value of the error, you get the sample median. The idea is to use a function that is a compromise between these two, so that you get some of the efficiency of the mean and some of the robustness of the median.
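To make that concrete, here is a sketch of what such an estimate looks like (the Huber function below is just the standard compromise choice; I can't say whether the Em on slide 12 is exactly this):

    % M-estimate of location: choose the value c that minimizes a loss rho
    % applied to the residuals.
    \hat{\mu}_M = \arg\min_{c}\ \sum_{i=1}^{n} \rho(y_i - c)
    % rho(r) = r^2   gives the sample mean
    % rho(r) = |r|   gives the sample median
    % Huber's rho (with tuning constant delta) is the usual compromise:
    \rho_\delta(r) =
      \begin{cases}
        \tfrac{1}{2} r^2                    & \text{if } |r| \le \delta \\
        \delta \left( |r| - \tfrac{1}{2}\delta \right) & \text{if } |r| > \delta
      \end{cases}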
Once you've picked your function, finding an m-estimate is just an optimization problem. So your question really boils down to one of finding optimization software. If your optimization problem is convex (and you can pick your m-estimator so that the problem is convex), then there's a lot of high-quality software out there.
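For a simple location estimate you don't even need heavy machinery. Here is a minimal sketch in base R, with the Huber loss and the conventional tuning constant k = 1.345 as my illustrative choices:

    # Huber loss: quadratic near zero (mean-like), linear in the tails (median-like).
    huber_rho <- function(r, k = 1.345) {
      ifelse(abs(r) <= k, 0.5 * r^2, k * (abs(r) - 0.5 * k))
    }

    # Objective: total Huber loss of the residuals around a candidate centre.
    huber_objective <- function(centre, y, k = 1.345) sum(huber_rho(y - centre, k))

    # Toy data with one outlier; the m-estimate lands between the mean and the median.
    set.seed(42)
    y <- c(rnorm(50), 25)
    fit <- optimize(huber_objective, interval = range(y), y = y)
    c(mean = mean(y), median = median(y), m_estimate = fit$minimum)

The same idea carries over to a leaf of a regression tree: the fitted value for the leaf is whatever minimizes the chosen loss over the responses that fall in it.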

Related

Should the linear solver be converging when Coloring is being computed?

I had to add some circular dependencies to my model, and thus added NonlinearBlockGS and LinearBlockGS to the Group with the circular dependency. I get messages like this
LN: LNBGSSolver 'LN: LNBGS' on system 'XXX' failed to converge in 10
iterations.
in the phase where it's finding the Coloring of the problem. There is a Dymos trajectory as part of the problem, but the circular dependency is not in the Trajectory group; it's upstream. The model, however, converges very easily when actually solving the problem. The number of FWD solves is the same as it was before, and everything seems to work fine. Should I be worried about anything?
The way our total-derivative coloring works is that we replace the partial derivatives with random numbers and then solve the linear system, so the linear solver should be converging. Now, whether or not it should converge with LNBGS in 10 iterations... probably not.
It's hard to speak definitively when putting random numbers into a matrix to invert it... but generally speaking it should remain invertible (though we can't promise that). That does not mean it will remain easily invertible. How close does the linear residual get during the coloring? It is decreasing, but slowly. Would more iterations let it get there?
If your problem is working well, I don't think you need to freak out about this. If you would like it to converge better, it won't hurt anything and might give you better coloring. You can increase the iprint of that solver to get more information on the convergence history.
Another option, if your system is small enough, is to try using the DirectSolver instead of LNBGS. For most models with fewer than 10,000 variables in them, a DirectSolver will be overall faster than the LNBGS. There is a nice symmetry to using LNBGS with NLBGS... but while the nonlinear solver tends to be a good choice (i.e. fast and stable) for cyclic dependencies, the same can't be said for its linear counterpart.
So my go-to combination is NLBGS and DirectSolver. You can't always use the DirectSolver: if you have distributed components in your model, or components that use the matrix-free derivative APIs (apply_linear, compute_jacvec_product), then LNBGS is a good option. But if everything is explicit components with compute_partials or implicit components that provide partials in the linearize method, then I suggest using the DirectSolver as your first option.
I think you may have discovered a coloring performance issue in OpenMDAO. When we compute coloring, internally we replace the component partials with random arrays matching the declared sparsity. Since we're not trying to find an actual solution when we compute coloring, we probably don't need to iterate more than once in any given system. And we shouldn't be generating convergence warnings when computing the coloring. I don't think you need to be worried in this case. I'll put a story in our bug tracker to look into this.

A Naive Question about Inference/Regression

I recently came across a question about the statistical inference of an estimator, but I am not sure how to do the inference part. Let me explain my question first:
Say I obtained a coefficient from a regression, call it a_1. Then I ran a second regression on a different sample and obtained a_2. Now I take the difference between the two estimates, D = a_1 - a_2, and I need to know whether D is statistically different from 0.
My question is how to do this. Is it the same as the comparison of two means, as stated in this link (http://www.stat.yale.edu/Courses/1997-98/101/meancomp.htm)? From my understanding, each point estimate is the mean of the sampling distribution of its coefficient, but I am not sure how to specify the number of observations in the formula shown in the above link.
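To show what I have in mind (this is only my guess at the analogue of the mean-comparison formula, and it assumes the two samples are independent):

    % Hypothetical sketch: large-sample z-test for the difference of two
    % independently estimated coefficients, using their reported standard errors.
    D = \hat{a}_1 - \hat{a}_2, \qquad
    \mathrm{SE}(D) = \sqrt{\mathrm{SE}(\hat{a}_1)^2 + \mathrm{SE}(\hat{a}_2)^2}, \qquad
    z = \frac{D}{\mathrm{SE}(D)}

If that is right, the coefficient standard errors already account for each sample size, so no separate observation count would enter the formula, but please correct me if I am wrong.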
Also, the above is a parametric method, from my understanding. Should I use the bootstrap instead?
Could someone please help with my understanding, or maybe point me to a good reference to follow?
Best

How to estimate gamma and cost parameters for SVM quickly

I want to train SVMs in R and I know there are functions such as e1071::tune.svm() that can be used to find the optimal parameters for the SVM. However, it seems there are some formulas out there (e.g. used in this report) that can give you a reasonable estimate of these parameters.
Since a grid search for the parameters can take quite a lot of time on larger datasets, and one usually has to provide a range of possible values anyway, I wondered whether there is a package that implements such formulas to get a quick estimate of the gamma and cost parameters for the SVM.
So far, I've found out that caret::train() might use such an approach to estimate sigma (which should be the reciprocal of 2*gamma^2), but I haven't tried it yet, since other calculations are still running (and will be, probably for the next few days). Is there also an implementation to estimate cost, or at least give a range of reasonable values?
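To show the kind of thing I mean, here is a sketch of the sigma heuristic using kernlab::sigest(), which I believe is what caret relies on under the hood (the iris data and the cost value are placeholders of mine, and I haven't checked the sigma/gamma conversion for every parameterisation):

    # Quick, data-driven estimate of the RBF kernel width via kernlab::sigest().
    library(kernlab)
    library(e1071)

    x <- as.matrix(iris[, 1:4])   # predictors; iris is just an illustrative data set
    y <- iris$Species

    # sigest() returns the 0.1 / 0.5 / 0.9 quantiles of plausible sigma values
    # for the kernel exp(-sigma * |x - x'|^2); the median is a common single pick.
    sig <- sigest(x, frac = 0.5)
    gamma_est <- sig[2]           # in this parameterisation, sigma plays the role of e1071's gamma

    # I know of no comparable closed-form estimate for cost, so cost = 1 here is
    # purely a placeholder; a short log-scale grid around it is a common fallback.
    fit <- svm(x, y, kernel = "radial", gamma = gamma_est, cost = 1)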
I have found a similar question that asks for alternatives to grid search in general. However, I would be interested in an R implementation of such alternatives, and I also hope things have developed further since that more general question was posted years ago.

R: Evaluate Gradient Boosting Machines (GBM) for Regression

What are the best ways to evaluate the fit of a GBM model in R (metrics, graphs, ratios)? And how should I interpret them?
I think maybe you are overthinking this one! Take a step back and think about what matters... the error. You have forecasted values and you have observed values; the difference tells you most of what you need to know when comparing across models. Basic measures like MSE, MPE, etc. should do fine. If you are looking to refine within a given model, I would recommend taking a look at the gbm documentation. For example, you can pass your gbm model object to summary() to get the relative influence of each of your variables. Additionally, you can find a lot of information in the documentation, so if you haven't taken a look, I would recommend doing so! I have posted the link at the bottom.
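For example, a minimal sketch of the kind of check I mean, using the Boston housing data from the MASS package purely for illustration:

    # Fit a small GBM, pick the number of trees by cross-validation, then look at
    # basic error measures on a held-out set and the relative-influence summary.
    library(gbm)
    library(MASS)    # for the Boston housing data (illustrative choice only)

    set.seed(1)
    idx   <- sample(nrow(Boston), 400)
    train <- Boston[idx, ]
    test  <- Boston[-idx, ]

    fit <- gbm(medv ~ ., data = train, distribution = "gaussian",
               n.trees = 3000, interaction.depth = 3, shrinkage = 0.01, cv.folds = 5)

    best_trees <- gbm.perf(fit, method = "cv")   # CV-chosen tree count (also plots train/CV error)

    pred <- predict(fit, newdata = test, n.trees = best_trees)
    err  <- test$medv - pred

    c(MSE  = mean(err^2),         # mean squared error
      RMSE = sqrt(mean(err^2)),   # same units as the response
      MAE  = mean(abs(err)))      # mean absolute error

    summary(fit, n.trees = best_trees)   # relative influence of each predictor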
-Carmine
gbm_documentation

How to find several solutions of a nonlinear equation using R, e.g. nleqslv?

As far as I understand, R's nonlinear equation solver nleqslv(x, fn) finds only one solution of the nonlinear equation.
However (as Bhas commented), the searchZeros function (from the same package) can find several solutions, depending on the starting points.
Question: is there some function in R that can help choose the set of initial points for searchZeros, which would help me find all the solutions?
I am interested in the case of a function of several variables.
I understand that the solution found pretty much depends on the initial approximation, so the brute-force way is to check some reasonable grid of initial approximations. However, is there perhaps a more intelligent way to get all the solutions?
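To make the brute-force option concrete, this is roughly what I have in mind (the example system, the grid range, and the spacing are all placeholders of mine):

    # Run nleqslv::searchZeros() from a grid of starting points and collect
    # the distinct roots it converges to.
    library(nleqslv)

    # Toy system with two known solutions, (1, 1) and (-1, -1):
    fn <- function(x) {
      c(x[1]^2 + x[2]^2 - 2,   # circle of radius sqrt(2)
        x[1] - x[2])           # diagonal line
    }

    # Brute-force grid of initial approximations; in practice the range and
    # spacing would come from whatever bounds the problem suggests.
    starts <- as.matrix(expand.grid(x1 = seq(-3, 3, by = 1),
                                    x2 = seq(-3, 3, by = 1)))

    res <- searchZeros(starts, fn)
    res$x   # matrix of the distinct solutions that were found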
