To simplify a big O expression
We omit all constants
We ignore lower powers of n
For example:
O(n + 5) = O(n)
O(n² + 6n + 7) = O(n²)
O(6n^(1/3) + n^(1/2) + 7) = O(n^(1/2))
Am I right in these examples?
1. We omit all constants
Well, strictly speaking, you don't omit all constants, only the outermost multiplicative constant. That means O(c·f(n)) = O(f(n)). Additive constants are fine too, since
f(n) < f(n) + c < 2f(n) starting with some n (assuming f grows without bound), therefore O(f(n) + c) = O(f(n)).
But you don't omit constants inside composite functions. It works in some special cases (O(log(cn)) or even O(log(n^c)), for instance), but not in general. Consider for example 2^(2n): it might be tempting to drop the 2 and put this in O(2^n), which is wrong, since 2^(2n) = 4^n grows strictly faster.
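If it helps, here is a tiny numerical check of that last point (plain Python; the chosen values of n are arbitrary):

# Constants inside a polynomial only change the hidden constant factor,
# but a constant inside an exponent changes the growth rate itself.
for n in (10, 20, 40, 80):
    poly_ratio = (2 * n) ** 2 / n ** 2    # always 4, so (2n)^2 is still O(n^2)
    exp_ratio = 2 ** (2 * n) / 2 ** n     # equals 2^n, grows without bound
    print(n, poly_ratio, exp_ratio)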
2. We ignore lower powers of n
True, but remember that you don't always work with polynomial functions. You can generally ignore any added asymptotically lower function: if you have f(n) and g(n) with g(n) = O(f(n)), then O(f(n) + g(n)) = O(f(n)).
You cannot do this with multiplication: for example, log n = O(n), yet O(n · log n) ≠ O(n).
You're almost right. The second rule should be that you ignore all but the term that grows fastest as n goes towards infinity. That's important if you have terms that are not powers of n, like logarithms or other mathematical functions.
It's also worth being aware that big O notation sometimes covers up other important details. An algorithm that is O(n log n) will have better performance than one that is O(n²), but only if the input is large enough for the dominant terms to dominate the running time. It may be that for the sizes of input you actually have to deal with in a specific application, the O(n²) algorithm actually performs better!
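To make that last point concrete, here is a rough sketch with entirely made-up cost models (the constant factor 100 is illustrative, not taken from any real algorithm):

import math

def cost_nlogn(n):
    # Hypothetical O(n log n) algorithm with a large constant factor.
    return 100 * n * math.log2(n)

def cost_n2(n):
    # Hypothetical O(n^2) algorithm with a small constant factor.
    return n * n

for n in (10, 100, 1000, 10000):
    print(n, cost_nlogn(n), cost_n2(n))
# For small n the quadratic model is cheaper; only around n ≈ 1000
# does the n log n model start to win.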
Related
OK, so I'm pretty new to CS and was recently learning about Big-O, Theta, and Omega, as well as the master theorem. In lecture I saw that, even though O(n) = O(n-1), T(n) is not the same as T(n-1), and I was wondering why that is.
Although both O(n) and T(n) use capital letters on the outside and lower-case n in the middle, they represent fundamentally different concepts.
If you’re analyzing an algorithm using a recurrence relation, it’s common to let T(n) denote the amount of time it takes for the algorithm to complete on an input of size n. As a result, we wouldn’t expect T(n) to be the same as T(n-1), since, in most cases, algorithms take longer to run when you give them larger inputs.
More generally, for any function f, if you wanted to claim that f(n) = f(n-1), you'd need to justify that assumption, because it generally isn't the case.
The tricky bit here is that when we write O(n), it looks like we're writing a function named O and passing in the argument n, but the notation means something totally different. The notation O(n) is a placeholder for "some function that, when the input gets really big, is bounded from above by a multiple of n." Similarly, O(n-1) means "some function that, when the input gets really big, is bounded from above by a multiple of n-1." And it happens to be the case that any function that's bounded from above by a multiple of n is also bounded from above by a multiple of n-1 (if f(n) ≤ c·n for large n, then f(n) ≤ 2c·(n-1) once n ≥ 2), which is why O(n) and O(n-1) denote the same thing.
Hope this helps!
I am trying to figure out the runtime and space complexity of the algorithm below.
Some say that the runtime complexity of this is O(n!), and I am guessing it is because there are n! recursive calls for a recursive algorithm that solves an n×n matrix, but I am not sure if I am right.
Also, is the space complexity also n!?
It might help to write out an explicit recurrence relation that governs the runtime of a straightforward implementation of the recursive algorithm. Notice that, in working on an n × n matrix, evaluating the sum requires making n recursive calls on matrices of size (n - 1) × (n - 1). Each recursive call requires about (n - 1)² additional time to set up, since we need to extract a submatrix of that size from the original matrix, so the total overhead across those n calls is Θ(n³): we're doing quadratic work linearly many times. That means that our work done is roughly
T(n) = nT(n - 1) + n³.
Completely ignoring the cubic term here, notice that expanding out the recursion will have the following effect:
T(n) = nT(n - 1) + ...
= n(n-1)T(n-2) + ...
= n(n-1)(n-2)T(n-3) + ...
and eventually we’ll get an n! term showing up, plus a bunch of extra terms from the cubic. So the work done here is at least Ω(n!), and probably a lot more once we factor in the cubic term.
As for the space complexity - when working with the space complexity, remember that once one branch of the recursion terminates we can reuse the space that branch was using. This means that we only really need to look at any one branch to see how much space is needed.
With a naive implementation of this summation where we explicitly compute the submatrices for the recursive calls, we'll need space to store one matrix of size n × n, one of size (n-1) × (n-1), one of size (n-2) × (n-2), etc. That space usage sums up to Θ(n³).
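For concreteness, a straightforward implementation of the kind of cofactor-expansion determinant described above might look like the following sketch (not necessarily the OP's exact code):

def det(matrix):
    # Naive recursive determinant by cofactor expansion along the first row.
    # An n x n call makes n recursive calls on (n-1) x (n-1) submatrices, and
    # building those n submatrices costs roughly n * (n-1)^2 work in total,
    # which is where the recurrence T(n) = n T(n-1) + Θ(n^3) comes from.
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        # Submatrix obtained by deleting row 0 and the current column.
        sub = [[matrix[r][c] for c in range(n) if c != col]
               for r in range(1, n)]
        sign = -1 if col % 2 else 1
        total += sign * matrix[0][col] * det(sub)
    return total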
There are a bunch of other algorithms you can use to compute determinants in much less time and space. Some are based on Gaussian elimination and run in time O(n³), for example.
How do I find the answer for: n! = Θ( )?
Even Big O is enough. All the clues I found involve complex math ideas.
What would be the correct approach to tackle this problem? A recursion tree seems like too much work.
The goal is to compare n! and n^(log n).
Θ(n!) is a perfectly fine, valid complexity, so n! = Θ(n!).
As Niklas pointed out, this is actually true for every function. However, for something like
6x² + 15x + 2, while you could write Θ(6x² + 15x + 2), it would generally be preferred to simply write Θ(x²) instead.
If you want to compare the two functions, simply plotting them on WolframAlpha might be considered sufficient to see that n! grows faster.
To mathematically determine the result, we can take the log of both, giving us log(n!) and log(n^(log n)) = log n · log n = (log n)².
Then, since log(n!) = Θ(n log n), and n log n > (log n)² for any large n, we could derive that Θ(n!) grows faster.
The derivation is perhaps non-trivial, and I'm slightly unsure whether it's actually possible, but we're a bit beyond the scope of Stack Overflow already (try the Mathematics site if you want more details).
If you want some sort of "closed form" expressions, you can get n! = Ω((sqrt(n/2))^n) and n! = O(n^n). Not sure those are more useful.
To derive them, see that (n/2)^(n/2) < n! < n^n.
To compare against n^(log n), use limit rules; you may also want to use n = e^(log n).
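If you just want a numerical sanity check of that comparison, a few lines of Python (using math.lgamma, since ln(n!) = lgamma(n + 1)) make the gap visible:

import math

# Compare log(n!) with log(n^(log n)) = (log n)^2 (natural logs throughout).
for n in (10, 100, 1000, 10**6):
    log_factorial = math.lgamma(n + 1)   # ln(n!)
    log_power = math.log(n) ** 2         # ln(n^(ln n))
    print(n, round(log_factorial, 1), round(log_power, 1))
# log(n!) pulls away rapidly, so n! grows much faster than n^(log n).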
I have been researching the log-sum-exp problem. I have a list of numbers stored as logarithms which I would like to sum and store in a logarithm.
the naive algorithm is
import math

def naive(listOfLogs):
    return math.log10(sum(10**x for x in listOfLogs))
many websites including:
logsumexp implementation in C?
and
http://machineintelligence.tumblr.com/post/4998477107/
recommend using
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10**(x-maxLog) for x in listOfLogs))
aka
def recommend(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + naive((x-maxLog) for x in listOfLogs)
What I don't understand is: if the recommended algorithm is better, why not call it recursively?
Would that provide even more benefit?
def recursive(listOfLogs):
    maxLog = max(listOfLogs)
    return maxLog + recursive((x-maxLog) for x in listOfLogs)
While I'm asking: are there other tricks to make this calculation more numerically stable?
Some background for others: when you're computing an expression of the following type directly
ln( exp(x_1) + exp(x_2) + ... )
you can run into two kinds of problems:
exp(x_i) can overflow (x_i is too big), resulting in numbers that you can't add together
exp(x_i) can underflow (x_i is too small), resulting in a bunch of zeroes
If all the values are big, or all are small, we can divide by some exp(const) and add const to the outside of the ln to get the same value. Thus if we can pick the right const, we can shift the values into some range to prevent overflow/underflow.
The OP's question is, why do we pick max(x_i) for this const instead of any other value? Why don't we recursively do this calculation, picking the max out of each subset and computing the logarithm repeatedly?
The answer: because it doesn't matter.
The reason? Let's say x_1 = 10 is big, and x_2 = -10 is small. (These numbers aren't even very large in magnitude, right?) The expression
ln( exp(10) + exp(-10) )
will give you a value very close to 10. If you don't believe me, go try it. In fact, in general, ln( exp(x_1) + exp(x_2) + ... ) will be very close to max(x_i) if some particular x_i is much bigger than all the others. (As an aside, this functional form, asymptotically, actually lets you mathematically pick the maximum from a set of numbers.)
Hence, the reason we pick the max instead of any other value is because the smaller values will hardly affect the result. If they underflow, they would have been too small to affect the sum anyway, because it would be dominated by the largest number and anything close to it. In computing terms, the contribution of the small numbers will be less than an ulp after computing the ln. So there's no reason to waste time computing the expression for the smaller values recursively if they will be lost in your final result anyway.
If you wanted to be really persnickety about implementing this, you'd divide by exp(max(x_i) - some_constant) or so to 'center' the resulting values around 1 to avoid both overflow and underflow, and that might give you a few extra digits of precision in the result. But avoiding overflow is much more important than avoiding underflow, because the former determines the result and the latter doesn't, so it's much simpler just to do it this way.
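To see this concretely, here is a tiny demonstration in the OP's base-10 setup (the log value 400 is chosen only because 10^400 exceeds what a double can hold); for natural logs, libraries such as SciPy also ship a logsumexp that applies this same max-shift internally:

import math

def naive(listOfLogs):
    # Same as the OP's naive() above.
    return math.log10(sum(10 ** x for x in listOfLogs))

def recommend(listOfLogs):
    # Same as the OP's recommend() above.
    maxLog = max(listOfLogs)
    return maxLog + math.log10(sum(10 ** (x - maxLog) for x in listOfLogs))

logs = [300.0, 400.0, 399.0]     # 10**400 does not fit in a double (max ~1.8e308)

print(recommend(logs))           # ~400.04: dominated by the largest term
try:
    print(naive(logs))
except OverflowError:
    print("naive() overflowed")  # 10 ** 400.0 raises OverflowError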
Not really any better to do it recursively. The problem's just that you want to make sure your finite-precision arithmetic doesn't swamp the answer in noise. By dealing with the max on its own, you ensure that any junk is kept small in the final answer because the most significant component of it is guaranteed to get through.
Apologies for the waffly explanation. Try it with some numbers yourself (a sensible list to start with might be [1E-5,1E25,1E-5]) and see what happens to get a feel for it.
As you have defined it, your recursive function will never terminate. That's because ((x - maxLog) for x in listOfLogs) still has the same number of elements as listOfLogs.
I don't think that this is easily fixable either, without significantly impacting either the performance or the precision (compared to the non-recursive version).
Find the big-O running time for each of these functions:
T(n) = T(n - 2) + n²
Our Answers: n², n³
T(n) = 3T(n/2) + n
Our Answers: O(n log n), O(n^(log₂ 3))
T(n) = 2T(n/3) + n
Our Answers: O(n log₃ n), O(n)
T(n) = 2T(n/2) + n³
Our Answers: O(n³ log₂ n), O(n³)
So we're having trouble deciding on the right answers for each of the questions.
We all got different results and would like an outside opinion on what the running time would be.
Thanks in advance.
A bit of clarification:
The functions in the questions appear to be running time functions, as hinted by their T() name and their n parameter. A more subtle hint is the fact that they are all recursive; recursive functions are, alas, a common occurrence when one produces a function to describe the running time of an algorithm (even when the algorithm itself doesn't formally use recursion). Indeed, recursive formulas are a rather inconvenient form, and that is why we use Big O notation to better summarize the behavior of an algorithm.
A running time function is a parametrized mathematical expression which allows computing a [sometimes approximate] relative value for the running time of an algorithm, given specific value(s) for the parameter(s). As is the case here, running time functions typically have a single parameter, often named n, corresponding to the total number of items the algorithm is expected to work on/with (e.g. for a search algorithm it could be the total number of records in a database, for a sort algorithm the number of entries in the unsorted list, and for a path-finding algorithm the number of nodes in the graph...). In some cases a running time function may have multiple arguments; for example, the performance of an algorithm performing some transformation on a graph may depend on both the total number of nodes and the total number of edges, or on the average number of connections between two nodes, etc.
The task at hand (for what appears to be homework, hence my partial answer) is therefore to find a Big O expression that qualifies the upper bound of each of the running time functions, whatever the underlying algorithm they may correspond to. The task is not that of finding and qualifying an algorithm to produce the results of the functions (this second possibility is also a very common type of exercise in Algorithms classes of a CS curriculum, but is apparently not what is required here).
The problem is therefore more one of mathematics than of Computer Science per se. Basically, one needs to find the asymptotic behavior of each of these functions as n approaches infinity (or at least a good upper bound on it).
This note from Prof. Jeff Erickson at the University of Illinois Urbana-Champaign provides a good intro to solving recurrences.
Although there are a few shortcuts to solving recurrences, particularly if one has a good command of calculus, a generic approach is to guess the answer and then to prove it by induction. Tools like Excel, a few snippets in a programming language such as Python, or MATLAB or Sage, can be useful to produce tables of the first few hundred values (or beyond) along with values such as n², n³, n!, as well as ratios of the terms of the function; these tables often provide enough insight into the function to find its closed form.
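For instance, a throwaway Python sketch along those lines for the first recurrence could look like this (the base cases T(1) = T(2) = 1 are arbitrary and only affect lower-order terms):

# Tabulate T(n) = T(n-2) + n^2 and compare it against candidate orders.
T = {1: 1, 2: 1}
for n in range(3, 201):
    T[n] = T[n - 2] + n ** 2

for n in (10, 50, 100, 200):
    print(n, T[n], T[n] / n ** 2, T[n] / n ** 3)
# T(n)/n^2 keeps growing, while T(n)/n^3 settles near 1/6,
# which points at Θ(n^3) with leading term n^3/6.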
A few hints regarding the answers listed in the question:
Function a)
O(n²) is for sure wrong:
a quick inspection of the first few values in the sequence shows that n² falls increasingly far below T(n).
O(n³), on the other hand, does grow at least as fast as T(n) as n gets big. A closer look shows that O(n³) is effectively the right order for the Big O notation of this function, and that n³/6 is a more precise estimate: for large values of n, and as n tends towards infinity, it differs from T(n) only by lower-order terms, a minute fraction compared with the coarser n³ estimate.
One can check that n³/6 is indeed the leading term by substitution:
T(n) = T(n-2) + n²          // (1) by definition
T(n) ≈ n³ / 6               // (2) our "guess"
T(n) = ((n - 2)³ / 6) + n²  // by substituting the (2) expression for T(n-2)
     = (n³ - 6n² + 12n - 8) / 6 + 6n² / 6
     = (n³ + 12n - 8) / 6
     = n³/6 + 2n - 4/3
     ~= n³/6                // as n grows towards infinity, the 2n and 4/3 terms
                            // become relatively insignificant, leaving us with the
                            // (n³ / 6) leading term, QED
(The leftover 2n - 4/3 merely reflects lower-order terms that the rough guess ignores; for example, with base case T(0) = 0 and even n, the exact sum is n(n+1)(n+2)/6, whose leading term is still n³/6.)