I make extensive use of Julia's linear equation solver, res = X\b. I have to call it millions of times in my program because of parameter variation. This worked fine while the dimensions were small (up to 30). Now that I want to analyse bigger systems, up to 1000, the linear solver is no longer efficient enough.
I think there may be a workaround. However, my X matrix is sometimes dense and sometimes sparse, so I need something that works well in both cases.
The b vector is all zeros except for one entry, which is always 1 (in fact it is always the last entry). Moreover, I don't need the whole res vector, just its first entry.
If your problem is of the form (A - µI)x = b, where µ is a variable parameter and A, b are fixed, you might work with diagonalization.
Let A = PDP° where P° denotes the inverse of P. Then (PDP° - µI)x = b can be transformed to
(D - µI)P°x = P°b,
P°x = P°b / (D - µI),
x = P(P°b / (D - µI)).
(Here / denotes elementwise division of the vector entries by the scalars D_ii - µ.)
After you have diagonalized A, computing a solution for any µ reduces to two matrix/vector products, or a single one if you can also precompute P°b.
Numerical instability will show up when µ is in the vicinity of an eigenvalue of A, because D_ii - µ is then close to zero.
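A minimal NumPy sketch of the diagonalization idea above, assuming A is diagonalizable and µ stays away from its eigenvalues. The helper name make_shifted_solver and the random test matrix are made up for illustration; they are not from the original posts.

```python
import numpy as np

def make_shifted_solver(A, b):
    """Diagonalize A once and return a function solving (A - mu*I) x = b."""
    eigvals, P = np.linalg.eig(A)       # A = P D P^-1, columns of P are eigenvectors
    Pinv_b = np.linalg.solve(P, b)      # P^-1 b, precomputed once

    def solve(mu):
        # Elementwise division by (D_ii - mu), then one matrix-vector product.
        # The result may have a complex dtype if A has complex eigenvalues.
        return P @ (Pinv_b / (eigvals - mu))

    return solve

# Illustrative data: b is zero except for a 1 in the last entry, as in the question.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
b = np.zeros(6)
b[-1] = 1.0

solve = make_shifted_solver(A, b)
mu = 10.0                               # chosen well away from the spectrum of A
x = solve(mu)
x_ref = np.linalg.solve(A - mu * np.eye(6), b)
print(np.allclose(x, x_ref))
```

If only the first entry of the solution is needed, the last product can be reduced to the first row of P, i.e. P[0, :] @ (Pinv_b / (eigvals - mu)), which costs O(n) per value of µ.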
Usually when people talk about speeding up linear solvers res = X \ b, it’s for multiple bs. But since your b isn’t changing, and you just keep changing X, none of those tricks apply.
The only way to speed this up, from a mathematical perspective, seems to be to ensure that Julia is picking the fastest solver for X \ b, i.e., if you know X is positive-definite, use Cholesky, etc. Matlab’s flowcharts for how it picks the solver to use for X \ b, for dense and sparse X, are available—most likely Julia implements something close to these flowcharts too, but again, maybe you can find some way to simplify or shortcut it.
All programming-related speedups (multiple threads: while each individual solver is probably already multi-threaded, it may be worth running multiple solvers in parallel when each solver uses fewer threads than cores; @simd if you're willing to dive into the solvers themselves; OpenCL/CUDA libraries; etc.) can then be applied.
Best approach for efficiency would be to use: JuliaMath/IterativeSolvers.jl. For A * x = b problems, I would recommend x = lsmr(A, b).
Second-best alternatives would be to give a bit more information to the compiler: instead of x = inv(A'A) * A' * b, do x = inv(cholfact(A'A)) * A' * b if Cholesky decomposition works for you. Otherwise, you could try U, S, V = svd(A) and x = V * diagm(1 ./ S) * U' * b.
Unsure if x = pinv(A) * b is optimized, but might be slightly more efficient than x = A \ b.
I am trying to calculate a point on a line.
I have the two end points of the line and the distance from one end point to the point I want to find (which is B).
A(2,4)
B(x,y)
C(4,32)
The distance between A to B is 5.
How can I calculate Bx and By using the following equations?
d = Math.Sqr((Bx-Ax)^2 + (By-Ay)^2)
d = Math.Sqr((Cx-Bx)^2 + (Cy-By)^2)
and then compare the equations above.
Here are the equations with the points substituted:
5 = Math.Sqr((Bx-2)^2 + (By-4)^2)
23.0713377 = Math.Sqr((4-Bx)^2 + (32-By)^2)
or
Math.Sqr((Bx-2)^2 + (By-4)^2) - 5 = Math.Sqr((4-Bx)^2 + (32-By)^2) - 23.0713377
How can I solve this using VBA?
Thank you!
I won't solve your equations above because they are an unnecessarily complex way to state the problem (and the existence of a solution is questionable in the presence of rounding), but all the points on the line A=(Ax,Ay) to C=(Cx,Cy) can be described as B=(Ax,Ay) + t*(Cx-Ax,Cy-Ay) with t between 0 and 1.
The distance between B and A is then given by d=t*Sqrt((Cx-Ax)^2+(Cy-Ay)^2), which you can invert to get the proper t for a given d - t=d/Sqrt((Cx-Ax)^2+(Cy-Ay)^2)
In your case, B(t) = (2,4) + t*(2,28), t=5/Sqrt(2^2+28^2) ~ 0.178 -> B ~ (2,4) + 0.178 * (2,28) ~ (2.356, 8.987).
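The parametrisation above can be sketched in a few lines. This is written in Python rather than VBA purely for illustration, and the function name point_at_distance is hypothetical; the arithmetic carries over to VBA unchanged.

```python
import math

def point_at_distance(a, c, d):
    """Return the point on the line from a towards c at distance d from a."""
    ax, ay = a
    cx, cy = c
    length = math.hypot(cx - ax, cy - ay)   # |AC|
    if length == 0:
        raise ValueError("A and C must be distinct points")
    t = d / length                          # t = d / |AC|
    return (ax + t * (cx - ax), ay + t * (cy - ay))

bx, by = point_at_distance((2, 4), (4, 32), 5)
print(round(bx, 3), round(by, 3))
```

Values of t outside [0, 1] give points on the extended line beyond A or C, so a caller may want to check 0 <= d <= |AC| first.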
VBA has no symbolic math capability. To solve this problem, there are different approaches:
Transform the equations to isolate one of the unknowns, most likely by substitution, and compute it (I recommend this for your problem.)
Transform your functions and differentiate them to use Newton's method (don't do this, it's overkill.)
Use a "brute force" convergence method: fix a min/max for each variable and use the bisection method to find what you want (I don't recommend this because you'll most likely "fall" into a local minimum/maximum in your case.)
So basically, I'd say go with the first way. It requires 15 minutes of tinkering with the equations, then you're set to go.
Suppose I have a general function f(z,a), where z and a are both real, and f takes real values for all z except in some interval (z1,z2), where it becomes complex. How do I determine z1 and z2 (which will be in terms of a) using Mathematica (or is this even possible)? What are the limitations?
For a test example, consider the function f[z_,a_]=Sqrt[(z-a)(z-2a)]. For real z and a, this takes on real values except in the interval (a,2a), where it becomes imaginary. How do I find this interval in Mathematica?
In general, I'd like to know how one would go about finding it mathematically for a general case. For a function with just two variables like this, it'd probably be straightforward to do a contour plot of the Riemann surface and observe the branch cuts. But what if it is a multivariate function? Is there a general approach that one can take?
What you have appears to be a Riemann surface parametrized by 'a'. Consider the algebraic (or analytic) relation g(a,z)=0 that would be spawned from this branch of a parametrized Riemann surface. In this case it is simply g^2 - (z - a)*(z - 2*a) == 0. More generally it might be obtained using GroebnerBasis, as below (no guarantee this will always work without some amount of user intervention).
grelation = First[GroebnerBasis[g - Sqrt[(z - a)*(z - 2*a)], {z, a, g}]]
Out[472]= 2 a^2 - g^2 - 3 a z + z^2
A necessary condition for the branch points, as functions of the parameter 'a', is that the zero set for 'g' not give a (single valued) function in a neighborhood of such points. This in turn means that the partial derivative of this relation with respect to g vanishes (this is from the implicit function theorem of multivariable calculus). So we find where grelation and its derivative both vanish, and solve for 'z' as a function of 'a'.
Solve[Eliminate[{grelation == 0, D[grelation, g] == 0}, g], z]
Out[481]= {{z -> a}, {z -> 2 a}}
Daniel Lichtblau
Wolfram Research
For polynomial systems (and some class of others), Reduce can do the job.
E.g.
In[1]:= Reduce[Element[{a, z}, Reals]
&& !Element[Sqrt[(z - a) (z - 2 a)], Reals], z]
Out[1]= (a < 0 && 2a < z < a) || (a > 0 && a < z < 2a)
This type of approach also works (often giving very complicated solutions for functions with many branch cuts) for other combinations of elementary functions I checked.
To find the branch cuts (as opposed to the simple class of branch points you're interested in) in general, I don't know of a good approach. The best place to find the detailed conventions that Mathematica uses is at the functions.wolfram site.
I do remember reading a good paper on this a while back... I'll try to find it....
That's right! The easiest approach I've seen for branch cut analysis uses the unwinding number. There's a paper about this, "Reasoning about the elementary functions of complex analysis", in the journal "Artificial Intelligence and Symbolic Computation". It and similar papers can be found on the homepage of one of the authors: http://www.apmaths.uwo.ca/~djeffrey/offprints.html.
For general functions you cannot make Mathematica calculate it.
Even for polynomials, finding an exact answer takes time.
I believe Mathematica uses some sort of quantifier elimination when it uses Reduce,
which takes time.
Without any restrictions on your functions (are they polynomials, continuous, smooth?)
one can easily construct functions which Mathematica cannot simplify further:
f[x_,y_] := Abs[Zeta[y+0.5+x*I]]*I
If this function is real for arbitrary x and any -0.5 < y < 0 or 0<y<0.5,
then you will have found a counterexample to the Riemann hypothesis,
and I'm sure Mathematica cannot give a correct answer.
This should be very simple. I have a function f(x), and I want to evaluate f'(x) for a given x in MATLAB.
All my searches have come up with symbolic math, which is not what I need, I need numerical differentiation.
E.g. if I define: fx = inline('x.^2')
I want to find, say, f'(3), which would be 6. I don't want to find 2x.
If your function is known to be twice differentiable, use
f'(x) = (f(x + h) - f(x - h)) / (2h)
which is second order accurate in h. If it is only once differentiable, use
f'(x) = (f(x + h) - f(x)) / h (*)
which is first order in h.
This is theory. In practice, things are quite tricky. I'll take the second formula (first order) as the analysis is simpler. Do the second order one as an exercise.
The very first observation is that you must make sure that (x + h) - x = h, otherwise you get huge errors. Indeed, f(x + h) and f(x) are close to each other (say 2.0456 and 2.0467), and when you subtract them, you lose a lot of significant figures (here the difference is 0.0011, which has 3 fewer significant figures than the original values). So any error on h is likely to have a huge impact on the result.
So, first step, fix a candidate h (I'll show you in a minute how to choose it), and take as h for your computation the quantity h' = (x + h) - x. If you are using a language like C, you must take care to define h or x as volatile so that this computation is not optimized away.
Next, the choice of h. The error in (*) has two parts: the truncation error and the roundoff error. The truncation error is because the formula is not exact:
(f(x + h) - f(x)) / h = f'(x) + e1(h)
where |e1(h)| <= h / 2 * sup_{t in [x, x+h]} |f''(t)|.
The roundoff error comes from the fact that f(x + h) and f(x) are close to each other. It can be estimated roughly as
e2(h) ~ epsilon_f |f(x) / h|
where epsilon_f is the relative precision in the computation of f(x) (or f(x + h), which is close). This has to be assessed from your problem. For simple functions, epsilon_f can be taken as the machine epsilon. For more complicated ones, it can be worse than that by orders of magnitude.
So you want h which minimizes e1(h) + e2(h). Plugging everything together and optimizing in h yields
h ~ sqrt(2 * epsilon_f * f / f'')
which has to be estimated from your function. You can take rough estimates. When in doubt, take h ~ sqrt(epsilon) where epsilon = machine accuracy. For the optimal choice of h, the relative accuracy to which the derivative is known is sqrt(epsilon_f), i.e. half the significant figures are correct.
In short: too small a h => roundoff error, too large a h => truncation error.
For the second order formula, same computation yields
h ~ (3 * epsilon_f * f / f''')^(1/3)
and a fractional accuracy of (epsilon_f)^(2/3) for the derivative (which is typically one or two significant figures better than the first order formula, assuming double precision).
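A rough Python sketch of the two formulas with the default step sizes suggested above, assuming epsilon_f equals the machine epsilon. Scaling the step by max(|x|, 1) is an extra assumption of this sketch (a common rule of thumb), not something stated in the answer.

```python
import math
import sys

EPS = sys.float_info.epsilon  # machine epsilon for double precision

def forward_diff(f, x, h=None):
    """First-order forward difference with h ~ sqrt(eps) by default."""
    if h is None:
        h = math.sqrt(EPS) * max(abs(x), 1.0)
    xp = x + h
    # Divide by the step actually taken, (x + h) - x, not by the nominal h.
    return (f(xp) - f(x)) / (xp - x)

def central_diff(f, x, h=None):
    """Second-order central difference with h ~ eps^(1/3) by default."""
    if h is None:
        h = EPS ** (1.0 / 3.0) * max(abs(x), 1.0)
    xp, xm = x + h, x - h
    return (f(xp) - f(xm)) / (xp - xm)

square = lambda x: x ** 2
print(forward_diff(square, 3.0), central_diff(square, 3.0))
print(central_diff(math.sin, 1.0), math.cos(1.0))
```

In Python there is no need for a volatile qualifier; the interpreter does not simplify (x + h) - x away.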
If this is too imprecise, feel free to ask for more methods; there are a lot of tricks to get better accuracy. Richardson extrapolation is a good start for smooth functions. But those methods typically evaluate f quite a few times, which may or may not be what you want if your function is expensive to compute.
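For reference, a small sketch of Richardson extrapolation applied to the central difference, assuming f is smooth near x. The function name and the defaults h=0.1 and levels=4 are arbitrary choices for illustration.

```python
import math

def richardson_derivative(f, x, h=0.1, levels=4):
    """Estimate f'(x) by Richardson extrapolation of central differences."""
    table = []
    for i in range(levels):
        step = h / 2 ** i
        # Column 0: plain central difference with the halved step.
        row = [(f(x + step) - f(x - step)) / (2 * step)]
        for k in range(1, i + 1):
            # The central-difference error has only even powers of the step,
            # so column k cancels the step**(2k) term with the factor 4**k.
            factor = 4 ** k
            row.append((factor * row[k - 1] - table[i - 1][k - 1]) / (factor - 1))
        table.append(row)
    return table[-1][-1]

estimate = richardson_derivative(math.sin, 1.0)
print(estimate, math.cos(1.0))
```

With levels=4 this evaluates f eight times, which illustrates the cost mentioned above.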
If you are going to use numerical derivatives a lot of times at different points, it becomes interesting to construct a Chebyshev approximation.
To get a numerical difference (symmetric difference), you calculate (f(x+dx)-f(x-dx))/(2*dx)
fx = @(x) x.^2;
fPrimeAt3 = (fx(3.1)-fx(2.9))/0.2;
Alternatively, you can create a vector of function values and apply DIFF, i.e.
xValues = 2:0.1:4;
fValues = fx(xValues);
df = diff(fValues)./0.1;
Note that diff takes the forward difference and assumes a spacing of 1, hence the division by 0.1 above.
However, in your case, you may be better off to define fx as a polynomial, and evaluating the derivative of the function, rather than the function values.
Lacking the symbolic toolbox, nothing stops you from using Derivest, a tool for automatic adaptive numerical differentiation.
derivest(@sin,pi)
ans =
-1
For your example it does very nicely. In fact, it even provides an estimate of the error in the resulting approximation.
fx = inline('x.^2');
[fp,errest] = derivest(fx,3)
fp =
6
errest =
3.6308e-14
Did you try the diff (calculates differences and approximates a derivative), gradient, or polyder (calculates the derivative of a polynomial) functions?
You can read more on these functions by using help <commandname> on MATLAB console, or use the function browser in the Help menu.
For a given function in analytical form, you can evaluate the derivative at a desired point with the following code:
syms x
df = diff(x^2);
df3 = double(subs(df, x, 3));  % convert the symbolic result to a number for fprintf
fprintf('f''(3)=%f\n', df3);
For pure numerical derivatives use the already given solutions by Jonas and posdef.
Is there a way, given a set of values (x,f(x)), to find the polynomial of a given degree that best fits the data?
I know polynomial interpolation, which is for finding a polynomial of degree n given n+1 data points, but here there are a large number of values and we want to find a low-degree polynomial (find best linear fit, best quadratic, best cubic, etc.). It might be related to least squares...
More generally, I would like to know the answer when we have a multivariate function -- points like (x,y,f(x,y)), say -- and want to find the best polynomial (p(x,y)) of a given degree in the variables. (Specifically a polynomial, not splines or Fourier series.)
Both theory and code/libraries (preferably in Python, but any language is okay) would be useful.
Thanks for everyone's replies. Here is another attempt at summarizing them. Pardon if I say too many "obvious" things: I knew nothing about least squares before, so everything was new to me.
NOT polynomial interpolation
Polynomial interpolation is fitting a polynomial of degree n given n+1 data points, e.g. finding a cubic that passes exactly through four given points. As said in the question, this was not what I wanted: I had a lot of points and wanted a small-degree polynomial (which will only approximately fit, unless we've been lucky). But since some of the answers insisted on talking about it, I should mention them :) Lagrange polynomial, Vandermonde matrix, etc.
What is least-squares?
"Least squares" is a particular definition/criterion/"metric" of "how well" a polynomial fits. (There are others, but this is simplest.) Say you are trying to fit a polynomial
p(x,y) = a + b x + c y + d x^2 + e y^2 + f x y
to some given data points (x_i, y_i, Z_i) (where "Z_i" was "f(x_i, y_i)" in the question). With least-squares the problem is to find the "best" coefficients (a,b,c,d,e,f), such that what is minimized (kept "least") is the "sum of squared residuals", namely
S = ∑_i (a + b x_i + c y_i + d x_i^2 + e y_i^2 + f x_i y_i - Z_i)^2
Theory
The important idea is that if you look at S as a function of (a,b,c,d,e,f), then S is minimized at a point at which its gradient is 0. This means that for example ∂S/∂f=0, i.e. that
∑_i 2 (a + … + f x_i y_i - Z_i) x_i y_i = 0
and similar equations for a, b, c, d, e.
Note that these are just linear equations in a…f. So we can solve them with Gaussian elimination or any of the usual methods.
This is still called "linear least squares", because although the function we wanted was a quadratic polynomial, it is still linear in the parameters (a,b,c,d,e,f). Note that the same thing works when we want p(x,y) to be any "linear combination" of arbitrary functions fj, instead of just a polynomial (= "linear combination of monomials").
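To make the theory concrete, here is a hedged NumPy sketch that builds the linear system obtained by setting the gradient of S to zero (the normal equations) and solves it directly. The helper names and the synthetic data are made up for illustration, and in practice a dedicated least-squares routine is usually preferable numerically.

```python
import numpy as np

def design_matrix(x, y):
    """One row per data point, one column per monomial 1, x, y, x^2, y^2, xy."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])

def fit_quadratic_surface(x, y, Z):
    """Return (a, b, c, d, e, f) minimizing the sum of squared residuals S."""
    A = design_matrix(x, y)
    # dS/d(coefficient) = 0 for every coefficient gives (A^T A) coeffs = A^T Z.
    return np.linalg.solve(A.T @ A, A.T @ np.asarray(Z, dtype=float))

# Synthetic data from a known quadratic, (x + y)^2 = x^2 + y^2 + 2xy.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = rng.uniform(0, 10, 50)
Z = (x + y) ** 2

coeffs = fit_quadratic_surface(x, y, Z)
print(np.round(coeffs, 6))
```

The same code works for any set of basis functions: only design_matrix needs to change.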
Code
For the univariate case (when there is only one variable x; the f_j are the monomials x^j), there is Numpy's polyfit:
>>> import numpy
>>> xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> ys = [1.1, 3.9, 11.2, 21.5, 34.8, 51, 70.2, 92.3, 117.4, 145.5]
>>> p = numpy.poly1d(numpy.polyfit(xs, ys, deg=2))
>>> print(p)
       2
1.517 x + 2.483 x + 0.4927
For the multivariate case, or linear least squares in general, there is SciPy's linalg.lstsq. As explained in its documentation, it takes a matrix A of the values f_j(x_i). (The theory is that it finds the Moore-Penrose pseudoinverse of A.) With our above example involving (x_i, y_i, Z_i), fitting a polynomial means the f_j are the monomials x^k y^l. The following finds the best quadratic (or best polynomial of any other degree, if you change the "degree = 2" line):
from scipy import linalg
import random

n = 20
x = [100*random.random() for i in range(n)]
y = [100*random.random() for i in range(n)]
Z = [(x[i]+y[i])**2 + 0.01*random.random() for i in range(n)]

degree = 2
A = []
for i in range(n):
    A.append([])
    for xd in range(degree+1):
        for yd in range(degree+1-xd):
            A[i].append((x[i]**xd)*(y[i]**yd))  # f_j(x_i)

c, _, _, _ = linalg.lstsq(A, Z)
j = 0
for xd in range(0, degree+1):
    for yd in range(0, degree+1-xd):
        print(" + (%.2f)x^%dy^%d" % (c[j], xd, yd), end="")
        j += 1
prints
+ (0.01)x^0y^0 + (-0.00)x^0y^1 + (1.00)x^0y^2 + (-0.00)x^1y^0 + (2.00)x^1y^1 + (1.00)x^2y^0
so it has discovered that the polynomial is x^2+2xy+y^2+0.01. [The last term is sometimes -0.01 and sometimes 0, which is to be expected because of the random noise we added.]
Alternatives to Python+Numpy/Scipy are R and Computer Algebra Systems: Sage, Mathematica, Matlab, Maple. Even Excel might be able to do it. Numerical Recipes discusses methods to implement it ourselves (in C, Fortran).
Concerns
It is strongly influenced by how the points are chosen. When I had x=y=range(20) instead of the random points, it always produced 1.33x^2+1.33xy+1.33y^2, which was puzzling... until I realised that because I always had x[i]=y[i], the polynomials were the same: x^2+2xy+y^2 = 4x^2 = (4/3)(x^2+xy+y^2). So the moral is that it is important to choose the points carefully to get the "right" polynomial. (If you can choose, you should choose Chebyshev nodes for polynomial interpolation; not sure if the same is true for least squares as well.)
Overfitting: higher-degree polynomials can always fit the data better. If you change the degree to 3 or 4 or 5, it still mostly recognizes the same quadratic polynomial (coefficients are 0 for higher-degree terms) but for larger degrees, it starts fitting higher-degree polynomials. But even with degree 6, taking larger n (more data points instead of 20, say 200) still fits the quadratic polynomial. So the moral is to avoid overfitting, for which it might help to take as many data points as possible.
There might be issues of numerical stability I don't fully understand.
If you don't need a polynomial, you can obtain better fits with other kinds of functions, e.g. splines (piecewise polynomials).
Yes, the way this is typically done is by using least squares. There are other ways of specifying how well a polynomial fits, but the theory is simplest for least squares. The general theory is called linear regression.
Your best bet is probably to start with Numerical Recipes.
R is free and will do everything you want and more, but it has a big learning curve.
If you have access to Mathematica, you can use the Fit function to do a least squares fit. I imagine Matlab and its open source counterpart Octave have a similar function.
For (x, f(x)) case:
import numpy
x = numpy.arange(10)
y = x**2
coeffs = numpy.polyfit(x, y, deg=2)
poly = numpy.poly1d(coeffs)
print(poly)
yp = numpy.polyval(poly, x)
print(yp - y)
Bear in mind that a polynomial of higher degree ALWAYS fits the data at least as well. However, polynomials of higher degree typically lead to highly improbable functions (see Occam's Razor), i.e. overfitting. You want to find a balance between simplicity (degree of polynomial) and fit (e.g. least-squares error). Quantitatively, there are tests for this, such as the Akaike Information Criterion or the Bayesian Information Criterion. These tests give a score indicating which model is to be preferred.
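As a rough illustration of that trade-off, the sketch below scores polynomial fits of increasing degree with the common Gaussian-error form of the AIC, n*ln(RSS/n) + 2k (defined up to an additive constant). The helper name, the synthetic data and the noise level are assumptions made up for this example.

```python
import numpy as np

def aic_for_degree(x, y, degree):
    """Return (AIC, RSS) for a least-squares polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, deg=degree)
    rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
    n, k = len(x), degree + 1           # k = number of fitted coefficients
    return n * np.log(rss / n) + 2 * k, rss

# Synthetic data: a quadratic plus a little Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.05 * rng.standard_normal(x.size)

scores = {d: aic_for_degree(x, y, d) for d in range(7)}
best_degree = min(scores, key=lambda d: scores[d][0])
for d, (aic, rss) in scores.items():
    print(f"degree {d}: RSS = {rss:.4g}, AIC = {aic:.2f}")
print("lowest AIC at degree", best_degree)
```

The RSS never increases with the degree, while the 2k term penalises the extra coefficients; the degree with the lowest AIC is the suggested compromise.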
If you want to fit the (xi, f(xi)) to a polynomial of degree n, then you would set up a linear least squares problem with the data (1, xi, xi^2, ..., xi^n, f(xi)). This will return a set of coefficients (c0, c1, ..., cn) so that the best-fitting polynomial is y = c0 + c1 * x + c2 * x^2 + ... + cn * x^n.
You can generalize this to more than one independent variable by including powers of y and combinations of x and y in the problem.
Lagrange polynomials (as @j w posted) give you an exact fit at the points you specify, but with polynomials of degree more than say 5 or 6 you can run into numerical instability.
Least squares gives you the "best fit" polynomial with error defined as the sum of squares of the individual errors. (Take the distance along the y-axis between the points you have and the function that results, square them, and sum them up.) The MATLAB polyfit function does this, and with multiple return arguments, you can have it automatically take care of scaling/offset issues (e.g. if you have 100 points all between x=312.1 and 312.3, and you want a 6th degree polynomial, you're going to want to calculate u = (x-312.2)/0.1 so the u-values are distributed between -1 and +1).
NOTE that the results of least-squares fits are strongly influenced by the distribution of x-axis values. If the x-values are equally spaced, then you'll get larger errors at the ends. If you have a case where you can choose the x values and you care about the maximum deviation from your known function and an interpolating polynomial, then the use of Chebyshev polynomials will give you something that is close to the perfect minimax polynomial (which is very hard to calculate). This is discussed at some length in Numerical Recipes.
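If you are free to choose the x values, the Chebyshev nodes (the roots of the Chebyshev polynomial T_n, mapped onto your interval) are the usual choice. A small sketch, with a made-up function name and an interval borrowed from the scaling example above:

```python
import math

def chebyshev_nodes(a, b, n):
    """Return the n roots of the Chebyshev polynomial T_n mapped from [-1, 1] to [a, b]."""
    mid = (a + b) / 2.0
    half = (b - a) / 2.0
    # Roots of T_n on [-1, 1] are cos((2k + 1) * pi / (2n)) for k = 0 .. n-1.
    return [mid + half * math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n)]

nodes = chebyshev_nodes(312.1, 312.3, 7)
print(nodes)
```

The nodes cluster towards the ends of the interval, which is what counteracts the larger errors seen there with equally spaced points.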
Edit: From what I gather, this all works well for functions of one variable. For multivariate functions it is likely to be much more difficult if the degree is more than, say, 2. I did find a reference on Google Books.
At college we had this book, which I still find extremely useful: Conte, de Boor; Elementary Numerical Analysis; McGraw-Hill. The relevant section is 6.2: Data Fitting.
The example code is in FORTRAN, and the listings are not very readable either, but the explanations are deep and clear at the same time. You end up understanding what you are doing, not just doing it (as is my experience with Numerical Recipes).
I usually start with Numerical Recipes but for things like this I quickly have to grab Conte-de Boor.
Maybe it's better to post some code... It's a bit stripped down, but the most relevant parts are there. It relies on numpy, obviously!
import numpy


class Memoize:
    """Minimal stand-in for the Memoize helper that was stripped from the original listing."""
    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]


def Tn(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), via the three-term recurrence."""
    if n == 0:
        return 1.0
    elif n == 1:
        return float(x)
    else:
        return (2.0 * x * Tn(n - 1, x)) - Tn(n - 2, x)


class ChebyshevFit:
    def __init__(self):
        self.Tn = Memoize(Tn)

    def fit(self, data, degree=None):
        """fit the data by a 'least squares' linear combination of chebyshev polynomials.
        cfr: Conte, de Boor; Elementary Numerical Analysis; McGraw-Hill (6.2: Data Fitting)
        """
        if degree is None:
            degree = 5
        data = sorted(data)
        self.range = start, end = (min(data)[0], max(data)[0])
        self.halfwidth = (end - start) / 2.0
        # map the x values onto [-1, 1]
        vec_x = [(x - start - self.halfwidth)/self.halfwidth for (x, y) in data]
        vec_f = [y for (x, y) in data]
        mat_phi = [numpy.array([self.Tn(i, x) for x in vec_x]) for i in range(degree+1)]
        # normal equations: mat_A * coefficients = vec_b
        mat_A = numpy.inner(mat_phi, mat_phi)
        vec_b = numpy.inner(vec_f, mat_phi)
        self.coefficients = numpy.linalg.solve(mat_A, vec_b)
        self.degree = degree

    def evaluate(self, x):
        """use Clenshaw algorithm (as written, this assumes degree >= 2)
        http://en.wikipedia.org/wiki/Clenshaw_algorithm
        """
        x = (x-self.range[0]-self.halfwidth) / self.halfwidth
        b_2 = float(self.coefficients[self.degree])
        b_1 = 2 * x * b_2 + float(self.coefficients[self.degree - 1])
        for i in range(2, self.degree):
            b_1, b_2 = 2.0 * x * b_1 + self.coefficients[self.degree - i] - b_2, b_1
        else:
            b_0 = x*b_1 + self.coefficients[0] - b_2
        return b_0
Remember, there's a big difference between approximating the polynomial and finding an exact one.
For example, if I give you 4 points, you could
Approximate a line with a method like least squares
Approximate a parabola with a method like least squares
Find an exact cubic function through these four points.
Be sure to select the method that's right for you!
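The three options can be compared directly with numpy.polyfit. The four points below are made up for illustration; with degree 3 the fit passes through all of them, while degrees 1 and 2 leave residuals.

```python
import numpy as np

# Four hypothetical data points that do not lie on a line or a parabola.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 2.0, 5.0])

max_residual = {}
for degree in (1, 2, 3):
    coeffs = np.polyfit(xs, ys, deg=degree)      # least-squares fit of this degree
    residuals = np.polyval(coeffs, xs) - ys
    max_residual[degree] = float(np.max(np.abs(residuals)))
    print(f"degree {degree}: max |residual| = {max_residual[degree]:.3g}")
```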
It's rather easy to scare up a quick fit using Excel's matrix functions if you know how to represent the least squares problem as a linear algebra problem. (That depends on how reliable you think Excel is as a linear algebra solver.)
The Lagrange polynomial is in some sense the "simplest" interpolating polynomial that fits a given set of data points.
It is sometimes problematic because it can vary wildly between data points.