CLRS exercise 3.2-4 Big-Oh vs Little Oh - math

I'm self-studying CLRS, and I've hit this point. The question I'm answering is:
Is the function ⌈lg lg n⌉! polynomially bounded?
Taking the logarithm of the function, I've reduced it down to
lg(⌈lg lg n⌉!) = Θ(lg lg n ⋅ lg lg lg n)
Now, all the solutions manuals seem to use little-oh at this point to get it down to
= o(lg lg n ⋅ lg lg n)
And this step confounds me a little; I thought I understood little-oh, but clearly not well enough. Can somebody frame it within this particular context? Also, the next steps go from
= o(lg^2 n)
to
= o(lg n)
Is this merely an application of L'Hôpital's rule?

If you have a function that is asymptotically equivalent to lg lg n ⋅ lg lg lg n (so it is in Θ(lg lg n ⋅ lg lg lg n)), then lg lg n ⋅ lg lg n is a strict upper bound, since lg lg lg n is in o(lg lg n).
I'm not sure about the last step:
If o(lg^2 n) means o((lg n)^2), you cannot conclude that it is in o(lg n). That would just be wrong.
If o(lg^2 n) means o(lg lg n), this is just switching to a larger upper bound, because lg lg n is in o(lg n).
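For reference, the whole chain can be written out as below. This is only a sketch: it assumes the manual's "lg^2 n" is shorthand for (lg lg n)^2, i.e. lg^2(lg n). That reading is an assumption on my part, not something the question confirms.

```latex
\begin{align*}
\lg\bigl(\lceil \lg\lg n \rceil !\bigr)
  &= \Theta(\lg\lg n \cdot \lg\lg\lg n)
     && \text{Stirling: } \lg(m!) = \Theta(m \lg m),\ m = \lceil \lg\lg n \rceil \\
  &= o\bigl((\lg\lg n)^2\bigr)
     && \text{since } \lg\lg\lg n = o(\lg\lg n) \\
  &= o(\lg n)
     && \text{since } (\lg m)^2 = o(m) \text{ with } m = \lg n
\end{align*}
```

Since a function f is polynomially bounded exactly when lg f(n) = O(lg n), under that reading the chain would answer the exercise with "yes".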

Related

Is there a more efficient way of nesting logarithms?

This is a continuation of the two questions posted here,
Declaring a functional recursive sequence in Matlab
Nesting a specific recursion in Pari-GP
To make a long story short, I've constructed a family of functions which solve the tetration functional equation. I've proven these things are holomorphic. And now it's time to make the graphs, or at least, somewhat passable code to evaluate these things. I've managed to get to about 13 significant digits in my precision, but if I try to get more, I encounter a specific error. That error is really nothing more than an overflow error. But it's a peculiar overflow error; Pari-GP doesn't seem to like nesting the logarithm.
My particular mathematical function is approximated by taking something large (think of the order e^e^e^e^e^e^e) to produce something small (of the order e^(-n)). The math inherently requires samples of large values to produce these small values. And strangely, as we get closer to numerically approximating (at about 13 significant digits or so), we also get closer to overflowing because we need such large values to get those 13 significant digits. I am a god-awful programmer, and I'm wondering if there could be some workaround I'm not seeing.
/*
This function constructs the approximate Abel function
The variable z is the main variable we care about; values of z where real(z)>3 almost surely produce overflow errors
The variable l is the multiplier of the approximate Abel function
The variable n is the depth of iteration required
n can be set to 100, but about 15 already gives enough accuracy
The functional equation this satisfies is exp(beta_function(z,l,n))/(1+exp(-l*z)) = beta_function(z+1,l,n); and this program approaches the solution for n to infinity
*/
beta_function(z,l,n) =
{
    my(out = 0);
    for(i=0,n-1,
        out = exp(out)/(exp(l*(n-i-z)) +1));
    out;
}
/*
This function is the error term between the approximate Abel function and the actual Abel function
The variable z is the main variable we care about
The variable l is the multiplier
The variable n is the depth of iteration inherited from beta_function
The variable k is the new depth of iteration for this function
n can still be set to about 100, but 15 or 20 is closer to optimal.
Setting the variable k above 10 will usually produce overflow errors unless the complex arguments of l and z are large.
Precision of about 10 digits is acquired at k = 5 or 6 for real z; for complex z less precision is acquired. k should be set to large values for complex z and l with large imaginary arguments.
*/
tau_K(z,l,n,k)={
    if(k == 1,
        -log(1+exp(-l*z)),
        log(1 + tau_K(z+1,l,n,k-1)/beta_function(z+1,l,n)) - log(1+exp(-l*z))
    )
}
/*
This is the actual Abel function
The variable z is the main variable we care about
The variable l is the multiplier
The variable n is the depth of iteration inherited from beta_function
The variable k is the depth of iteration inherited from tau_K
The functional equation this satisfies is exp(Abl_L(z,l,n,k)) = Abl_L(z+1,l,n,k); and this function approaches that solution for n,k to infinity
*/
Abl_L(z,l,n,k) ={
beta_function(z,l,n) + tau_K(z,l,n,k);
}
This is the code for approximating the functions I've proven are holomorphic; but sadly, my code is just horrible. Here is some expected output, where you can see the functional equation being satisfied to about 10 - 13 significant digits.
Abl_L(1,log(2),100,5)
%52 = 0.1520155156321416705967746811
exp(Abl_L(0,log(2),100,5))
%53 = 0.1520155156321485241351294757
Abl_L(1+I,0.3 + 0.3*I,100,14)
%59 = 0.3353395055605129001249035662 + 1.113155080425616717814647305*I
exp(Abl_L(0+I,0.3 + 0.3*I,100,14))
%61 = 0.3353395055605136611147422467 + 1.113155080425614418399986325*I
Abl_L(0.5+5*I, 0.2+3*I,100,60)
%68 = -0.2622549204469267170737985296 + 1.453935357725113433325798650*I
exp(Abl_L(-0.5+5*I, 0.2+3*I,100,60))
%69 = -0.2622549205108654273925182635 + 1.453935357685525635276573253*I
Now, you'll notice I have to change the k value for different values. When the arguments z,l are further away from the real axis, we can make k very large (and we have to to get good accuracy), but it'll still overflow eventually; typically once we've achieved about 13-15 significant digits, is when the functions will start to blow up. You'll note, that setting k =60, means we're taking 60 logarithms. This already sounds like a bad idea, lol. Mathematically though, the value Abl_L(z,l,infinity,infinity) is precisely the function I want. I know that must be odd; nested infinite for-loops sounds like nonsense, lol.
I'm wondering if anyone can think of a way to avoid these overflow errors and obtain a higher degree of accuracy. In a perfect world, this object most definitely converges, and this code is flawless (albeit, it may be a little slow); but we'd probably need to increase the stacksize indefinitely. In theory this is perfectly fine; but in reality, it's more than impractical. Is there any way, as a programmer, one can work around this?
The only other option I have at this point is to try and create a bruteforce algorithm to discover the Taylor series of this function; but I'm having less than no luck at doing this. The process is very unique, and trying to solve this problem using Taylor series kind of takes us back to square one. Unless, someone here can think of a fancy way of recovering Taylor series from this expression.
I'm open to all suggestions, any comments, honestly. I'm at my wits end; and I'm wondering if this is just one of those things where the only solution is to increase the stacksize indefinitely (which will absolutely work). It's not just that I'm dealing with large numbers. It's that I need larger and larger values to compute a small value. For that reason, I wonder if there's some kind of quick work around I'm not seeing. The error Pari-GP spits out is always with tau_K, so I'm wondering if this has been coded suboptimally; and that I should add something to it to reduce stacksize as it iterates. Or, if that's even possible. Again, I'm a horrible programmer. I need someone to explain this to me like I'm in kindergarten.
Any help, comments, questions for clarification, are more than welcome. I'm like a dog chasing his tail at this point; wondering why he can't take 1000 logarithms, lol.
Regards.
EDIT:
I thought I'd add in that I can produce arbitrary precision but we have to keep the argument of z way off in the left half plane. If the variables n,k = -real(z) then we can produce arbitrary accuracy by making n as large as we want. Here's some output to explain this, where I've used \p 200 and we pretty much have equality at this level (minus some digits).
Abl_L(-1000,1+I,1000,1000)
%16 = -0.29532276871494189936534470547577975723321944770194434340228137221059739121428422475938130544369331383702421911689967920679087535009910425871326862226131457477211238400580694414163545689138863426335946 + 1.5986481048938885384507658431034702033660039263036525275298731995537068062017849201570422126715147679264813047746465919488794895784667843154275008585688490133825421586142532469402244721785671947462053*I
exp(Abl_L(-1001,1+I,1000,1000))
%17 = -0.29532276871494189936534470547577975723321944770194434340228137221059739121428422475938130544369331383702421911689967920679087535009910425871326862226131457477211238400580694414163545689138863426335945 + 1.5986481048938885384507658431034702033660039263036525275298731995537068062017849201570422126715147679264813047746465919488794895784667843154275008585688490133825421586142532469402244721785671947462053*I
Abl_L(-900 + 2*I, log(2) + 3*I,900,900)
%18 = 0.20353875452777667678084511743583613390002687634123569448354843781494362200997943624836883436552749978073278597542986537166527005507457802227019178454911106220050245899257485038491446550396897420145640 - 5.0331931122239257925629364016676903584393129868620886431850253696250415005420068629776255235599535892051199267683839967636562292529054669236477082528566454129529102224074017515566663538666679347982267*I
exp(Abl_L(-901+2*I,log(2) + 3*I,900,900))
%19 = 0.20353875452777667678084511743583613390002687634123569448354843781494362200997943624836883436552749978073278597542986537166527005507457802227019178454911106220050245980468697844651953381258310669530583 - 5.0331931122239257925629364016676903584393129868620886431850253696250415005420068629776255235599535892051199267683839967636562292529054669236477082528566454129529102221938340371793896394856865112060084*I
Abl_L(-967 -200*I,12 + 5*I,600,600)
%20 = -0.27654907399026253909314469851908124578844308887705076177457491260312326399816915518145788812138543930757803667195961206089367474489771076618495231437711085298551748942104123736438439579713006923910623 - 1.6112686617153127854042520499848670075221756090591592745779176831161238110695974282839335636124974589920150876805977093815716044137123254329208112200116893459086654166069454464903158662028146092983832*I
exp(Abl_L(-968 -200*I,12 + 5*I,600,600))
%21 = -0.27654907399026253909314469851908124578844308887705076177457491260312326399816915518145788812138543930757803667195961206089367474489771076618495231437711085298551748942104123731995533634133194224880928 - 1.6112686617153127854042520499848670075221756090591592745779176831161238110695974282839335636124974589920150876805977093815716044137123254329208112200116893459086654166069454464833417170799085356582884*I
The trouble is, we can't just apply exp over and over to go forward and expect to keep the same precision. The problem lies with exp, which displays so much chaotic behaviour as you iterate it in the complex plane that this is doomed to fail.
Well, I answered my own question. #user207421 posted a comment, and I'm not sure if it meant what I thought it meant, but I think it got me to where I want. I had sort of assumed that exp wouldn't inherit the precision of its argument, but apparently it does. So all I needed was to define,
Abl_L(z,l,n,k) ={
    if(real(z) <= -max(n,k),
        beta_function(z,l,n) + tau_K(z,l,n,k),
        exp(Abl_L(z-1,l,n,k)));
}
Everything works perfectly fine from here; of course, for what I need it for. So, I answered my own question, and it was pretty simple. I just needed an if statement.
Thanks anyway, to anyone who read this.

Efficiently finding the closest zero of an arbitrary function

In summary, I am trying to start at a given x and find the nearest point in the positive direction where f(x) = 0. For simplicity, solutions are only needed in the interval [initial_x, maximum_x] (the maximum is given), but any better reach is desirable. Additionally, a specific precision is not mandatory; I am looking to maximize it, but not at the cost of performance.
While this seems simple, there are a few caveats that make the solution more difficult.
Performance is the first priority, even over some precision. The zero needs to be found in the fewest possible calls to f(x), as this code will be run many times per second.
There are not guaranteed to be any specific number of zeros on this line. There may be zero, one, or many places that the function intersects the x-axis. (This is why a direct binary search will not work.)
The function f(x) cannot be manipulated algebraically, only supporting numerical evaluation at a discrete point. (This is why the solution cannot be found analytically.)
My current strategy is to define a step size that is within an acceptable loss of precision and then test in increments until an interval is found on which there is guaranteed to be at least one zero (in [a,b], a and b are on opposite sides of 0). From there, I use a binary search to narrow down the (more) exact point.
// assuming f(x) != 0 at the starting point
initial_y = f(x);
x += step_size; // the starting point is already evaluated, so begin one step ahead
while (x < maximum_x) {
    y = f(x);
    // test to see if y has crossed 0 relative to the starting sign
    if ((initial_y > 0 && y < 0) || (initial_y < 0 && y > 0)) {
        return binary_search(x - step_size, x);
    }
    x += step_size;
}
This has several disadvantages, mainly the fact that there is a significant trade-off between resolution and performance (the smaller step_size is, the better it works but the longer it takes). Is there a more efficient formula or strategy I can take? I thought of using the value of y to scale the step size, but I cannot figure out how to preserve precision while doing that.
The solution can be in any language because I am looking more for a strategy to find the zeros, than a specific program.
(edit:)
The function above is assumed to be continuous.
To clarify the question, I understand that this problem may be impossible to solve exactly. I am just asking for ways to improve the speed or precision of the algorithm. The one I am currently using is working quite well, even though it fails during many edge cases.
For example, a solution that requires fewer steps with similar precision or another algorithm that increases the precision or reliability with some performance impact would both be extremely helpful.
Your problem is essentially impossible to solve in the general case. For example, no algorithm can find the "first" root of sin(1/x), starting from x=0.
A tentative answer is exponential search, i.e. starting from a small step and increasing it following a geometric progression rather than an arithmetic one, until you find a change of sign. But this will fail if the first root is closer than the initial step, or if the first root is followed by a close one.
Without any information on the behavior of f, I would not even try anything (other than a "standard" root finder); this is too hopeless! (But I am sure you do have some information.)
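The exponential search described above could be sketched as follows in Python. This is an illustration only: the names find_bracket, bisect and first_zero, the initial step of 1e-3 and the growth factor of 2 are all assumptions chosen for the example, and the sketch inherits the failure modes already mentioned (a root closer than the first step, or two roots inside one step).

```python
def find_bracket(f, x0, x_max, first_step=1e-3, growth=2.0):
    """Walk right from x0 with geometrically growing steps until f changes sign.

    Returns (a, b) such that f changes sign on [a, b] (or f(b) == 0),
    or None if no sign change is seen before x_max.
    """
    y0 = f(x0)
    if y0 == 0:
        return (x0, x0)
    a, step = x0, first_step
    while a < x_max:
        b = min(a + step, x_max)
        yb = f(b)
        if yb == 0 or (yb > 0) != (y0 > 0):
            return (a, b)
        a, step = b, step * growth  # geometric, not arithmetic, progression
    return None


def bisect(f, a, b, tol=1e-9):
    """Plain bisection on a bracket [a, b] where f changes sign."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fm == 0:
            return m
        if (fm > 0) == (fa > 0):
            a, fa = m, fm  # the sign change is in the right half
        else:
            b = m          # the sign change is in the left half
    return 0.5 * (a + b)


def first_zero(f, x0, x_max):
    """Return an approximate zero to the right of x0, or None if none is bracketed."""
    bracket = find_bracket(f, x0, x_max)
    return None if bracket is None else bisect(f, *bracket)
```

For example, first_zero(lambda x: x - 2.0, 0.0, 10.0) brackets the root after about a dozen evaluations, where a fixed step of 1e-3 would need roughly two thousand.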

Recursion Time Complexity Definition Confusion

The time complexity of a recursive algorithm is said to be
Given a recursion algorithm, its time complexity O(T) is typically
the product of the number of recursion invocations (denoted as R)
and the time complexity of calculation (denoted as O(s))
that incurs along with each recursion
O(T) = R * O(s)
Looking at a recursive function:
void algo(int n) {
    if (n == 0) return; // base case, so the recursion terminates
    for (int i = 0; i < n; i++); // empty loop standing in for O(n) work
    algo(n / 2);
}
According to the definition above, R is log n and O(s) is n, so the result should be O(n log n), whereas mathematical induction proves that the result is O(n).
Please do not prove the induction method. I am asking why the given definition does not work with my approach.
Great question! This hits at two different ways of accounting for the amount of work that's done in a recursive call chain.
The original strategy that you described for computing the amount of work done in a recursive call - multiply the work done per call by the number of calls - has an implicit assumption buried within it. Namely, this assumes that every recursive call does the same amount of work. If that is indeed the case, then you can determine the total work done as the product of the number of calls and the work per call.
However, this strategy doesn't usually work if the amount of work done per call varies as a function of the arguments to the call. After all, we can't talk about multiplying "the" amount of work done by a call by the number of calls if there isn't a single value representing how much work is done!
A more general strategy for determining how much work is done by a recursive call chain is to add up the amount of work done by each individual recursive call. In the case of the function that you've outlined above, the work done by the first call is n. The second call does n/2 work, because the amount of work it does is linear in its argument. The third call does n/4 work, the fourth n/8 work, etc. This means that the total work done is bounded by
n + n/2 + n/4 + n/8 + n/16 + ...
= n(1 + 1/2 + 1/4 + 1/8 + 1/16 + ...)
≤ 2n,
which is where the tighter O(n) bound comes from.
As a note, the idea of "add up all the work done by all the calls" is completely equivalent to "multiply the amount of work done per call by the number of calls" in the specific case where the amount of work done by each call is the same. Do you see why?
Alternatively, if you're okay getting a conservative upper bound on the amount of work done by a recursive call chain, you can multiply the number of calls by the maximum work done by any one call. That will never underestimate the total, but it won't always give you the right bound. That's what's happening here in the example you've listed - each call does at most n work, and there are O(log n) calls, so the total work is indeed O(n log n). That just doesn't happen to be a tight bound.
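To see the two bounds side by side, here is a small Python sketch that counts the loop iterations directly instead of running the recursion. The helper name total_work is made up for this illustration, and it ignores the final call with n == 0, which does no loop work.

```python
def total_work(n):
    """Return (calls, summed_work) for the halving recursion algo(n)."""
    calls, work = 0, 0
    while n > 0:      # each pass mirrors one recursive call that runs the loop
        calls += 1
        work += n     # the for-loop in that call performs n iterations
        n //= 2       # integer division, as in algo(n / 2)
    return calls, work


for n in (16, 1024, 10**6):
    calls, work = total_work(n)
    # columns: n, number of calls, summed work, calls * max work per call, 2n
    print(n, calls, work, calls * n, 2 * n)
```

The summed work always stays below 2n, while the product of the call count and the largest per-call work grows like n log n, which matches the "correct but not tight" bound described above.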
A quick note - I don't think it would be appropriate to call the strategy of multiplying the work done per call by the number of calls the "definition" of the amount of work done by a recursive call chain. As mentioned above, that's more of a "strategy for determining the work done" than a formal definition. If anything, I'd argue that the correct formal definition would be "the sum of the amounts of work done by each individual recursive call," since that more accurately accounts for how much total time will be spent.
Hope this helps!
I think you are trying to find information about the master theorem, which is what is used to prove the time complexity of recursive algorithms.
https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)
Also, you usually can't determine an algorithm's runtime just from looking at it, especially for recursive ones. That's why your quick analysis differs from the proof by induction.

How do I efficiently find the maximum value in an array containing values of a smooth function?

I have a function that takes a floating point number and returns a floating point number. It can be assumed that if you were to graph the output of this function it would be 'n' shaped, i.e. there would be a single maximum point, and no other points on the function with a zero slope. We also know that the input value that yields this maximum output will lie between two known points, perhaps 0.0 and 1.0.
I need to efficiently find the input value that yields the maximum output value to some degree of approximation, without doing an exhaustive search.
I'm looking for something similar to Newton's Method which finds the roots of a function, but since my function is opaque I can't get its derivative.
I would like to down-thumb all the other answers so far, for various reasons, but I won't.
An excellent and efficient method for minimizing (or maximizing) smooth functions when derivatives are not available is parabolic interpolation. It is common to write the algorithm so it temporarily switches to the golden-section search (Brent's minimizer) when parabolic interpolation does not progress as fast as golden-section would.
I wrote such an algorithm in C++. Any offers?
UPDATE: There is a C version of the Brent minimizer in GSL. The archives are here: ftp://ftp.club.cc.cmu.edu/gnu/gsl/ Note that it will be covered by some flavor of GNU "copyleft."
As I write this, the latest-and-greatest appears to be gsl-1.14.tar.gz. The minimizer is located in the file gsl-1.14/min/brent.c. It appears to have termination criteria similar to what I implemented. I have not studied how it decides to switch to golden section, but for the OP, that is probably moot.
UPDATE 2: I googled up a public domain Java version, translated from FORTRAN. I cannot vouch for its quality. http://www1.fpl.fs.fed.us/Fmin.java I notice that the hard-coded machine epsilon ("machine precision" in the comments) is 1/2 the value for a typical PC today. Change the value of eps to 2.22045e-16.
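For readers who just want the derivative-free fallback, here is a minimal Python sketch of golden-section search for a maximum. It is not Brent's full method (there is no parabolic interpolation step) and it is not taken from GSL or the Java file; the function name golden_section_max and the tolerance are assumptions made for this example.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618


def golden_section_max(f, a, b, tol=1e-6):
    """Locate the maximum of a unimodal ('n'-shaped) f on [a, b].

    Each iteration shrinks the interval by a factor of about 0.618
    and costs exactly one new evaluation of f.
    """
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # the maximum lies in [a, d]; the old c becomes the new d
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # the maximum lies in [c, b]; the old d becomes the new c
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)
```

Calling golden_section_max(lambda x: -x * (x - 1.0), 0.0, 1.0) should land close to 0.5. A full Brent minimizer would typically need fewer evaluations on a smooth function, so treat this as the simple baseline.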
Edit 2: The method described in Jive Dadson's answer is a better way to go about this. I'm leaving my answer up since it's easier to implement, if speed isn't too much of an issue.
Use a form of binary search, combined with numeric derivative approximations.
Given the interval [a, b], let x = (a + b) /2
Let epsilon be something very small.
Is (f(x + epsilon) - f(x)) positive? If yes, the function is still growing at x, so you recursively search the interval [x, b]
Otherwise, search the interval [a, x].
There might be a problem if the max lies between x and x + epsilon, but you might give this a try.
Edit: The advantage to this approach is that it exploits the known properties of the function in question. That is, I assumed by "n"-shaped, you meant, increasing-max-decreasing. Here's some Python code I wrote to test the algorithm:
def f(x):
    return -x * (x - 1.0)

def findMax(function, a, b, maxSlope):
    x = (a + b) / 2.0
    e = 0.0001
    slope = (function(x + e) - function(x)) / e
    if abs(slope) < maxSlope:
        return x
    if slope > 0:
        return findMax(function, x, b, maxSlope)
    else:
        return findMax(function, a, x, maxSlope)
Typing findMax(f, 0, 3, 0.01) should return 0.504, as desired.
For optimizing a concave function, which is the type of function you are talking about, without evaluating the derivative I would use the secant method.
Given the two initial values x[0]=0.0 and x[1]=1.0 I would proceed to compute the next approximations as:
def next_x(x, xprev):
    return x - f(x) * (x - xprev) / (f(x) - f(xprev))
and thus compute x[2], x[3], ... until the change in x becomes small enough.
Edit: As Jive explains, this solution is for root finding which is not the question posed. For optimization the proper solution is the Brent minimizer as explained in his answer.
The Levenberg-Marquardt algorithm is a Newton's-method-like optimizer. It has a C/C++ implementation, levmar, that doesn't require you to define the derivative function. Instead it will evaluate the objective function in the current neighborhood to move to the maximum.
BTW: this website appears to have been updated since I last visited it; I hope it's even the same one I remembered. Apparently it now also supports other languages.
Given that it's only a function of a single variable and has one extremum in the interval, you don't really need Newton's method. Some sort of line search algorithm should suffice. This wikipedia article is actually not a bad starting point, if short on details. Note in particular that you could just use the method described under "direct search", starting with the end points of your interval as your two points.
I'm not sure if you'd consider that an "exhaustive search", but it should actually be pretty fast I think for this sort of function (that is, a continuous, smooth function with only one local extremum in the given interval).
You could reduce it to a simple linear fit on the deltas, finding the place where it crosses the x axis. A linear fit can be done very quickly.
Or just take 3 points (left/top/right) and fit the parabola.
It depends mostly on the nature of the underlying relation between x and y, I think.
Edit: this is in case you have an array of values, as the question's title states. When you have a function, use Newton-Raphson.
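The three-point parabola idea can be sketched like this in Python. The function name parabola_vertex is hypothetical; the formula is the standard vertex of the parabola through three points, and the result is only as good as the assumption that the data looks roughly parabolic near the peak.

```python
def parabola_vertex(a, fa, b, fb, c, fc):
    """x-coordinate of the vertex of the parabola through (a, fa), (b, fb), (c, fc)."""
    num = (b - a) ** 2 * (fb - fc) - (b - c) ** 2 * (fb - fa)
    den = (b - a) * (fb - fc) - (b - c) * (fb - fa)
    if den == 0:
        # the three points lie on a straight line, so there is no vertex
        raise ValueError("points are collinear; no unique parabola vertex")
    return b - 0.5 * num / den
```

With array data, a, b and c would be the sample positions left of, at, and right of the largest sample. For an exactly quadratic function such as -x * (x - 1) sampled at 0, 0.3 and 1, the estimate is the true maximum at 0.5 up to rounding.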

Big O Log problem solving

I have a question that comes from an algorithms book I'm reading, and I am stumped on how to solve it (it's been a long time since I've done log or exponent math). The problem is as follows:
Suppose we are comparing implementations of insertion sort and merge sort on the same
machine. For inputs of size n, insertion sort runs in 8n^2 steps, while merge sort runs in 64n log n steps. For which values of n does insertion sort beat merge sort?
Log is base 2. I've started out trying to solve for equality, but get stuck around n = 8 log n.
I would like the answer to discuss how to solve this mathematically (brute force with excel not admissible sorry ;) ). Any links to the description of log math would be very helpful in my understanding your answer as well.
Thank you in advance!
http://www.wolframalpha.com/input/?i=solve%288+log%282%2Cn%29%3Dn%2Cn%29
(edited since old link stopped working)
Your best bet is to use Newton's method.
http://en.wikipedia.org/wiki/Newton%27s_method
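As a rough sketch of what that looks like for this problem, here is Newton's method applied to f(n) = n - 8 log₂ n in Python. The starting guess of 40 and the function names are assumptions for this example. The derivative vanishes at n = 8 / ln 2 (about 11.5), so a starting guess near that value would behave badly.

```python
import math


def f(n):
    return n - 8.0 * math.log2(n)


def f_prime(n):
    # d/dn [n - 8 * ln(n) / ln(2)] = 1 - 8 / (n * ln 2)
    return 1.0 - 8.0 / (n * math.log(2.0))


def newton(n0, tol=1e-12, max_iter=50):
    n = n0
    for _ in range(max_iter):
        step = f(n) / f_prime(n)
        n -= step
        if abs(step) < tol:
            return n
    raise RuntimeError("Newton's method did not converge")


root = newton(40.0)
print(root)  # the larger crossover point, between 43 and 44
```

There is also a smaller crossing close to n = 1.1, which a starting guess near 1 would find instead.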
One technique for solving this would be to simply grab a graphing calculator and graph both functions (see the Wolfram link in another answer). Find the intersection that interests you (in case there are multiple intersections, as there are in your example).
In any case, there isn't a simple expression to solve n = 8 log₂ n (as far as I know). It may be simpler to rephrase the question as: "Find a zero of f(n) = n - 8 log₂ n". First, find a region containing the intersection you're interested in, and keep shrinking that region. For instance, suppose you know your target n is greater than 42, but less than 44. f(42) is less than 0, and f(44) is greater than 0. Try f(43). It's less than 0, so try 43.5. It's still less than 0, so try 43.75. It's greater than 0, so try 43.625. It's greater than 0, so keep going down, and so on. This technique is called binary search.
Sorry, that's just a variation of "brute force with excel" :-)
Edit:
For the fun of it, I made a spreadsheet that solves this problem with binary search: binary‑search.xls . The binary search logic is in the second data column, and I just auto-extended that.
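The same binary search is easy to sketch in Python instead of a spreadsheet. The bracket [42, 44] comes from the walkthrough above; the function names and the tolerance are assumptions for this example. The last line simply checks the original inequality 8n^2 < 64 n log₂ n for small integers.

```python
import math


def f(n):
    return n - 8.0 * math.log2(n)


def bisect_root(lo, hi, tol=1e-9):
    """Bisection for f on [lo, hi]; requires f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid   # root is to the right of mid
        else:
            hi = mid   # root is at or to the left of mid
    return 0.5 * (lo + hi)


crossover = bisect_root(42.0, 44.0)

# integer sizes for which insertion sort (8n^2 steps) takes fewer steps
winners = [n for n in range(1, 100) if 8 * n * n < 64 * n * math.log2(n)]
print(crossover, winners[0], winners[-1])
```

On integer input sizes this reports the range on which insertion sort takes fewer steps, with the upper end being the largest integer below the crossover.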

Resources