CS 3343/3341, Analysis of Algorithms, Spring 2012
Recursion Trees and Cost Analysis


Introduction: This page covers introductory algorithm execution analysis focusing on recursion trees and cost analysis. Your text covers some of this material on pages 88-93. Here "cost" means the execution time, and, quoting from your text: "in a recursion tree, each node represents the cost of a single subproblem somewhere in the set of recursive function invocations."
Binary Search: In a recursive version of binary search, we start with n items to search, and at each level, the range of the search is halved. At a given level, the cost is bounded by a fixed constant c. Here the recurrence is:

T(n) = T(n/2) + c

Each level has a single node with cost c. Thus the total cost is bounded by (number of levels) * c.

We can determine the number of levels by deciding how many iterations of halving n it takes to get down to 1. Ignoring the rounding of integers, we are looking for a value i such that (n / 2^i) = 1. In general we'll get a real number i, which we round to an integer. Take the log base two of each side to get a series of equations (all logs below are base two):

(n / 2^i) = 1, or taking logs

log(n / 2^i) = log(1), or since log(1) = 0

log(n) - log(2^i) = 0, or

log(n) = i * log(2), or since log(2) = 1

i = log(n).

We've been saying this all along: binary search needs log2(n) steps in the worst case (the item may be found earlier).
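As a concrete check (a minimal Python sketch; the function name and the choice of integer floor-halving are illustrative, not from the text), we can count the halvings directly and compare with log2(n):

```python
import math

def binary_search_depth(n):
    """Count how many halvings shrink a range of n items down to 1."""
    depth = 0
    while n > 1:
        n //= 2          # each level halves the search range
        depth += 1
    return depth

for n in [16, 1024, 10**6]:
    # with floor-halving, the depth equals floor(log2(n))
    print(n, binary_search_depth(n), math.floor(math.log2(n)))
```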


Binary Search with Unequal Divisions: Suppose we altered binary search to divide the remaining interval at each stage into 1/10 to the left and 9/10 to the right. In the worst case, the item being sought is always on the right, and here the recurrence is:

T(n) = T(9n/10) + c

As before, the worst-case number of levels will be given by the number of times we need to divide n by (10/9) to get to 1 (or, equivalently, the number of times we need to multiply n by 9/10 to get to 1). This is the value of i in the equation (n / (10/9)^i) = 1.

[n / (10/9)^i] = 1, or taking logs

log[n / (10/9)^i] = log(1), or

log(n) - log[(10/9)^i] = log(1), or since log(1) = 0

log(n) - i * log(10/9) = 0, or

i = log(n) / log(10/9), or

i = log(n) / 0.15200, or

i = 6.5788 * log(n)

Here log2(10/9) = ln(10/9) / ln(2) = 0.10536 / 0.693147 = 0.15200. (We can get logs base two from natural logs by dividing the natural log by ln(2) = 0.693147.)

Thus we see that binary search with these unequal divisions only increases the worst-case time by a constant factor of 6.5788.
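The constant factor and the resulting depth can be checked numerically (an illustrative sketch; `unequal_depth` is a made-up helper that multiplies the range size by 9/10 until it reaches 1):

```python
import math

# the constant factor 1 / log2(10/9) derived above
factor = 1 / math.log2(10 / 9)
print(round(factor, 4))          # approximately 6.5788

def unequal_depth(n):
    """Count 9/10-multiplications needed to shrink a range of n items to 1."""
    depth = 0
    x = float(n)
    while x > 1:
        x *= 9 / 10              # worst case: always recurse on the 9/10 side
        depth += 1
    return depth

# for n = 1024, log2(n) = 10, so the depth should be about 6.5788 * 10
print(unequal_depth(1024))
```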


Quicksort: Here there is a more interesting recursion tree, with each node at each level having two subnodes until the leaf nodes at the bottom. The cost at each level is c * n for a constant c. (This is the cost of doing all the partitioning at a given level -- not a single partition.)

Best Case Quicksort: In the best case, everything will be exactly halved at each stage, giving a recurrence of:

T(n) = 2 T(n/2) + Θ(n)

So the height of the tree will be the same as with binary search: log2(n) and the total cost will be c * n * log2(n). Thus in the best case this will be a Θ(n log n) algorithm.
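Unrolling the best-case recurrence numerically (a sketch assuming T(1) = 0 and taking the Θ(n) term as exactly n, for n a power of two) confirms the n * log2(n) total:

```python
import math

def T_best(n):
    """Best-case quicksort recurrence T(n) = 2 T(n/2) + n, with T(1) = 0."""
    if n <= 1:
        return 0
    return 2 * T_best(n // 2) + n   # two half-size subproblems plus linear partition cost

for n in [8, 64, 1024]:
    # the unrolled total matches n * log2(n) exactly for powers of two
    print(n, T_best(n), n * int(math.log2(n)))
```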

Worst Case Quicksort: In the worst case, each partition has just the pivot, with everything else on one side or the other. This worst case is encountered using the book's (non-randomized) algorithm working on an array that is already sorted. Here the recurrence will be:

T(n) = T(n - 1) + T(0)+ Θ(n), and
    T(0) = Θ(1)

Here the tree has height n, and the cost at each level is at most c*n, so this is a Θ(n^2) algorithm.
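The arithmetic series behind this quadratic bound can be checked directly (a sketch taking the Θ(n) partition cost as exactly n and T(0) = 0):

```python
def T_worst(n):
    """Worst-case quicksort recurrence T(n) = T(n-1) + n, unrolled as a sum."""
    total = 0
    for k in range(1, n + 1):    # costs n + (n-1) + ... + 1
        total += k
    return total

for n in [10, 100]:
    # the sum equals n(n+1)/2, which is Theta(n^2)
    print(n, T_worst(n), n * (n + 1) // 2)
```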

Unbalanced Quicksort: Your text, on page 176, considers a possible partitioning encountered by quicksort in which the ratios are 1/10 and 9/10 at each node, as with the binary search with unequal divisions above. Here the recurrence is:

T(n) = T(9n/10) + T(n/10)+ c n

In this case, the smallest depth of a leaf is log10(n), while the largest depth is log10/9(n). See the previous section or the logarithm page for this value. Even with this bad splitting, the cost is still Θ(n log n).


[Figure: recursion tree for the 1/10 to 9/10 split; full-size picture omitted.]

The cost at the top level is for 1 partition; at the next level down it is for 2 partitions, then 4 partitions, then 8, and so forth. In spite of the increasing number of partitions as we go down the levels, the cost per level is about the same, since all the partitions at a given level work on roughly the same total number of array elements. At the level given by log10(n), the leftmost downward path reaches a leaf node and terminates. Other paths die out as the depth increases, and this is shown in the diagram with "<= cn" as the cost, since parts of the array are already sorted and no longer worked on. The deepest path is at the far right, with depth log10/9(n). In any case, the algorithm is Θ(n*log(n)).
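A rough numerical check of the recurrence (an illustrative sketch with c = 1, real-valued subproblem sizes, and recursion stopping below size 1; these simplifications are mine, not the text's) shows the total staying within a small constant factor of n * log2(n):

```python
import math

def T_split(n):
    """Unbalanced recurrence T(n) = T(9n/10) + T(n/10) + n, stopping below 1."""
    if n < 1:
        return 0.0
    return T_split(0.9 * n) + T_split(0.1 * n) + n

for n in [100, 1000]:
    # ratio to n * log2(n) stays bounded as n grows, consistent with Theta(n log n)
    ratio = T_split(n) / (n * math.log2(n))
    print(n, round(ratio, 2))
```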


Revision date: 2012-02-01. (Please use ISO 8601, the International Standard Date and Time Notation.)