Asymptotic upper bound of a function
An upper bound, as the name suggests, puts an upper limit on a function's growth. The upper bound is another function that grows at least as fast as the original function. What is the point of talking about one function in place of another? The function we use is, in general, much simpler than the actual function for computing the running time or the space required to process an input of a given size. It is a lot easier to compare simplified functions than to compare complicated ones.
For a function f, we define the notation O, called big O, in the following ways:
- f(x) = O(f(x)).
- If f(x) = O(g(x)), then k f(x) = O(g(x)) for any non-zero constant k.
- For example, 5x^3 = O(x^3), 2 log x = O(log x), and -x^3 = O(x^3) (taking k = -1).
- If f(x) = O(g(x)) and |h(x)| < |f(x)| for all sufficiently large x, then f(x) + h(x) = O(g(x)).
- For example, 5x^3 - 25x^2 + 1 = O(x^3) because, for sufficiently large x, |-25x^2 + 1| = 25x^2 - 1 is much less than |5x^3| = 5x^3. So, f(x) + h(x) = 5x^3 - 25x^2 + 1 = O(x^3), as f(x) = 5x^3 = O(x^3).
- We can prove by similar logic that x^3 = O(5x^3 - 25x^2 + 1).
- If f(x) = O(g(x)) and |h(x)| > |g(x)| for all sufficiently large x, then f(x) = O(h(x)).
- For example, x^3 = O(x^4), because if x is sufficiently large, x^4 > x^3.
Note that whenever there is an inequality on functions, we are only interested in what happens when x is large; we don't bother about what happens for small x.
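As a quick numeric illustration of these rules, consider the example above. The following sketch is my own, not part of the original text; the witness values M = 6 and x0 = 10 are arbitrary choices that happen to work:

```python
# Numeric sanity check (not a proof) that 5x^3 - 25x^2 + 1 = O(x^3):
# the bound |f(x)| < M * x^3 should hold for all x beyond some x0.
# M = 6 and x0 = 10 are assumed witness values, not canonical ones.

def f(x):
    return 5 * x**3 - 25 * x**2 + 1

M, x0 = 6, 10
assert all(abs(f(x)) < M * x**3 for x in range(x0 + 1, 10_000))
print(f"|5x^3 - 25x^2 + 1| < {M} * x^3 for all sampled x > {x0}")
```

Of course, a check over a finite range can never prove an asymptotic claim; it is only a handy way to test candidate constants.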
Note
To summarize the above definition: you can drop constant multipliers (rule 2), ignore lower-order terms (rule 3), and overestimate (rule 4). You can also combine these in any way, because the rules can be applied any number of times.
We had to consider the absolute values of the functions to cater to cases where values are negative, which never happens for a running time, but we still allow it for completeness.
Note
There is something unusual about the = sign here. Just because f(x) = O(g(x)), it does not follow that O(g(x)) = f(x). In fact, the latter does not even mean anything.
It is enough for all purposes to just know the preceding definition of the big O notation. You can read the following formal definition if you are interested; otherwise, you can skip the rest of this subsection.
The preceding idea can be summarized in a formal way. We say the expression f(x) = O(g(x)) means that positive constants M and x_0 exist, such that |f(x)| < M|g(x)| whenever x > x_0. Remember that you just have to find one example of M and x_0 that satisfy the condition to make the assertion f(x) = O(g(x)).
For example, Figure 1 shows the function T(x) = 100x^2 + 2000x + 200. This function is O(x^2), with x_0 = 11 and M = 300. The graph of 300x^2 overtakes the graph of T(x) at x = 11 and then stays above T(x) up to infinity. Notice that the function 300x^2 is lower than T(x) for smaller values of x, but that does not affect our conclusion.
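The claim in Figure 1 can be verified numerically as well. This sketch (again my own illustration) checks that 300x^2 dominates T(x) beyond x_0 = 11 while failing below it:

```python
# Check the Figure 1 claim: T(x) = 100x^2 + 2000x + 200 is O(x^2),
# witnessed by M = 300 and x0 = 11 (the values given in the text).

def T(x):
    return 100 * x**2 + 2000 * x + 200

M, x0 = 300, 11
assert T(5) > M * 5**2          # the bound fails for small x (12_700 > 7_500)
assert all(T(x) < M * x**2 for x in range(x0 + 1, 100_000))
print("T(x) < 300x^2 for all sampled x > 11")
```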
To see that this is the same thing as the previous four points, first think of x_0 as the way to ensure that x is sufficiently large. I leave it up to you to prove the above four conditions from the formal definition.
I will, however, show some examples of using the formal definition:
- 5x^2 = O(x^2) because we can say, for example, x_0 = 10 and M = 10, and thus f(x) < M g(x) whenever x > x_0; that is, 5x^2 < 10x^2 whenever x > 10.
- It is also true that 5x^2 = O(x^3) because we can say, for example, x_0 = 10 and M = 10, and thus f(x) < M g(x) whenever x > x_0; that is, 5x^2 < 10x^3 whenever x > 10. This highlights the point that if f(x) = O(g(x)), then f(x) = O(h(x)) also holds for any function h(x) that grows at least as fast as g(x).
- How about the function f(x) = 5x^2 - 10x + 3? We can easily see that when x is sufficiently large, 5x^2 will far surpass the term 10x. To prove the point, note that 5x^2 > 10x whenever x > 2. Every time we increment x by one, the increment in 5x^2 is 10x + 5, while the increment in 10x is just the constant 10. Since 10x + 5 > 10 for all positive x, it is easy to see why 5x^2 will stay above 10x as x grows higher and higher. So, by the earlier rules, f(x) = 5x^2 - 10x + 3 = O(x^2).
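Each of the claims in the list above can be tested the same way. Here is a compact sketch (my own; the witness pair M = 6, x_0 = 1 for the last claim is an arbitrary choice, while the first two pairs come from the text):

```python
# One concrete witness pair (M, x0) per claim; any pair that works
# is enough to assert f(x) = O(g(x)).

claims = [
    (lambda x: 5 * x**2,              lambda x: x**2, 10, 10),  # 5x^2 = O(x^2)
    (lambda x: 5 * x**2,              lambda x: x**3, 10, 10),  # 5x^2 = O(x^3)
    (lambda x: 5 * x**2 - 10 * x + 3, lambda x: x**2,  6,  1),  # 5x^2-10x+3 = O(x^2)
]
for f, g, M, x0 in claims:
    assert all(abs(f(x)) < M * abs(g(x)) for x in range(x0 + 1, 10_000))
print("all three claims hold on the sampled ranges")
```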
In general, any polynomial of the form a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0 is O(x^n). To show this, we will first see that a_0 = O(1). This is true because we can have x_0 = 1 and M = 2|a_0|, and we will have |a_0| < 2|a_0| whenever x > 1 (assuming a_0 ≠ 0; if a_0 = 0, the bound holds trivially with any positive M).
Now, let us assume it is true for some n; thus, a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0 = O(x^n). What this means, of course, is that some M_n and x_0 exist such that |a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0| < M_n x^n whenever x > x_0. We can safely assume that x_0 > 2, because if it is not so, we can simply add 2 to it to get a new x_0, which is at least 2.
Now, |a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0| < M_n x^n implies |a_{n+1} x^{n+1} + a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0| ≤ |a_{n+1} x^{n+1}| + |a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0| < |a_{n+1} x^{n+1}| + M_n x^n.
This means |a_{n+1} x^{n+1}| + M_n x^n > |a_{n+1} x^{n+1} + a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0|.
If we take M_{n+1} = |a_{n+1}| + M_n, we can see that M_{n+1} x^{n+1} = |a_{n+1}| x^{n+1} + M_n x^{n+1} = |a_{n+1} x^{n+1}| + M_n x^{n+1} > |a_{n+1} x^{n+1}| + M_n x^n > |a_{n+1} x^{n+1} + a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0|.
That is to say, |a_{n+1} x^{n+1} + a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0| < M_{n+1} x^{n+1} for all x > x_0; that is, a_{n+1} x^{n+1} + a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + … + a_0 = O(x^{n+1}).
Now, we have it true for n = 0, that is, a_0 = O(1). This means, by our last conclusion, a_1 x + a_0 = O(x). This means, by the same logic, a_2 x^2 + a_1 x + a_0 = O(x^2), and so on. We can easily see that this means it is true for all polynomials of positive integral degree.
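To make the induction concrete, the following sketch (my own; the example polynomial 5x^2 - 10x + 3 is an arbitrary choice) builds the constant M exactly as the proof does, starting from M_0 = 2|a_0| and applying M_{n+1} = |a_{n+1}| + M_n once per higher coefficient, and then tests the resulting bound numerically:

```python
# Build M iteratively, mirroring the proof: M_0 = 2|a_0| (base case),
# then M_{n+1} = |a_{n+1}| + M_n for each higher coefficient.

coeffs = [3, -10, 5]        # a_0, a_1, a_2: represents 5x^2 - 10x + 3

M = 2 * abs(coeffs[0])      # base case: M_0 = 2|a_0|
for a in coeffs[1:]:
    M = abs(a) + M          # inductive step: M_{n+1} = |a_{n+1}| + M_n

def p(x):
    return sum(a * x**k for k, a in enumerate(coeffs))

n, x0 = len(coeffs) - 1, 2  # test x > x0, so every tested x exceeds 2
assert all(abs(p(x)) < M * x**n for x in range(x0 + 1, 10_000))
print(f"|5x^2 - 10x + 3| < {M} * x^2 for all sampled x > {x0}")
```

The constant this construction produces (M = 21 here) is far from the smallest that works, but the proof never needed a tight constant, only one that exists.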