CS 3343/3341
Analysis of Algorithms
Quicksort: Recursion Trees and Cost Analysis


Introduction: This page uses quicksort to introduce the analysis of algorithm execution time, focusing on recursion trees and cost analysis. Your text covers some of this material on pages 88-93. Here "cost" means the execution time, and, quoting from your text: "in a recursion tree, each node represents the cost of a single subproblem somewhere in the set of recursive function invocations."
Quicksort: Quicksort gives a particularly interesting recursion tree, in which each node has two subnodes at the level below it, down to the leaf nodes at the bottom. The cost at each level is c * n for a constant c. (This is the cost of doing all the partitioning at a given level -- not a single partition.)
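
To make concrete what the "partitioning" at each node of the tree does, here is a minimal Python sketch of quicksort with a Lomuto-style partition (an illustration only, not the text's exact pseudocode); the constant c absorbs the per-element work of the partition loop.

  def partition(A, p, r):
      # Lomuto-style partition: use A[r] as the pivot and return its final index.
      pivot = A[r]
      i = p - 1
      for j in range(p, r):            # each element is compared to the pivot once,
          if A[j] <= pivot:            # so this one partition costs about c * (r - p)
              i += 1
              A[i], A[j] = A[j], A[i]
      A[i + 1], A[r] = A[r], A[i + 1]
      return i + 1

  def quicksort(A, p, r):
      # Sort A[p..r] in place; the partitions at any one level of the
      # recursion tree together touch at most n elements, hence cost c * n.
      if p < r:
          q = partition(A, p, r)
          quicksort(A, p, q - 1)
          quicksort(A, q + 1, r)

  A = [5, 2, 9, 7, 1, 8]
  quicksort(A, 0, len(A) - 1)
  print(A)                             # [1, 2, 5, 7, 8, 9]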


Best Case Quicksort: In the best case, everything will be exactly halved at each stage, giving a recurrence of:

T(n) = 2 T(n/2) + Θ(n)

So the height of the tree will be the same as any time we repeatedly halve n, namely log2(n), and the total cost will be c * n * log2(n). Thus in the best case this is a Θ(n log n) algorithm. Here is a picture of the recursion tree:


[Figure: recursion tree for best-case quicksort.]
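
As a quick sanity check of the recurrence (a Python sketch, with the constant c arbitrarily set to 1), T(n) can be evaluated directly for powers of 2 and compared against n * log2(n):

  import math

  def T_best(n, c=1):
      # T(n) = 2*T(n/2) + c*n for n a power of 2, with T(1) = c.
      if n <= 1:
          return c
      return 2 * T_best(n // 2, c) + c * n

  for n in [2**k for k in range(1, 11)]:
      print(n, T_best(n), round(n * math.log2(n)))
      # T(n) tracks c * n * log2(n), plus a lower-order c * n term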


Worst Case Quicksort: In the worst case, each partition puts just the pivot in place, with everything else on one side or the other. This worst case is encountered using the book's (non-randomized) algorithm on an array that is already sorted. Here the recurrence will be:

T(n) = T(n-1) + T(0) + Θ(n), and
    T(0) = Θ(1)

Here the tree has height n, and the cost at each level is at most c*n, so this is a Θ(n²) algorithm.
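
The recurrence can also be unrolled directly: the levels do about c*n + c*(n-1) + ... + c*1 work, which sums to c*n(n+1)/2 = Θ(n²). A small Python sketch (again taking c = 1) confirms the arithmetic:

  def T_worst(n, c=1):
      # Unroll T(n) = T(n-1) + T(0) + c*n, with T(0) = c.
      total = c                        # the final T(0) at the bottom of the chain
      for k in range(1, n + 1):
          total += c * k + c           # level that partitions k elements, plus its empty T(0) side
      return total

  for n in [10, 100, 1000]:
      print(n, T_worst(n), n * (n + 1) // 2 + n + 1)
      # the total is c*n(n+1)/2 plus lower-order terms: Theta(n^2)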


Unbalanced Quicksort: Your text, on page 176, considers a possible partitioning encountered by quicksort in which the ratios are 1/10 and 9/10 at each node, as with the binary search with unequal divisions above. Here the recurrence is:

T(n) = T(9n/10) + T(n/10) + c * n

The depth of each leaf is determined by how many times n can be multiplied by a constant fraction (here 1/10 or 9/10) before the size gets down to 1.

In this case, the smallest depth of a leaf is log10(n) (along the branch that always takes the 1/10 side), while the largest depth is log10/9(n) (along the branch that always takes the 9/10 side). See the previous section or the logarithm page for this value. Even with this lopsided splitting, the cost is still Θ(n log n).
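
For a concrete sense of these depths (a small Python sketch; n = 10^6 is just an arbitrary example size):

  import math

  n = 10**6
  print(math.log10(n))                 # shallowest leaf depth: log10(n) = 6
  print(math.log(n, 10 / 9))           # deepest leaf depth: log10/9(n), about 131
  print(math.log2(n))                  # for comparison, perfect halving gives log2(n), about 20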


[Figure: recursion tree for the 1/10 : 9/10 splitting.]

The cost at the top level is that of doing 1 partition; at the next level down it is that of doing 2 partitions, then 4 partitions, then 8, and so forth. In spite of the increasing number of partitions as we go down the levels, the cost per level is about the same, since all the partitions at a given level together work on at most n array elements. At the level given by log10(n), the leftmost downward branch reaches a leaf node and terminates. Other branches give out as the depth increases, and this is shown in the diagram with "<= cn" as the cost, since parts of the array are already sorted and no longer worked on. The deepest branch downward is at the far right, with depth log10/9(n). In any case, the algorithm is Θ(n log n).
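
As a rough check of that argument (a Python sketch that tracks only subproblem sizes, taking c = 1), one can sum the partitioning cost over the whole 1/10 : 9/10 tree and compare it to n * log2(n):

  import math

  def unbalanced_cost(n, c=1.0):
      # Simulate the tree of T(n) = T(n/10) + T(9n/10) + c*n, summing c*m over
      # every internal node (that is, every partition of m elements).
      total = 0.0
      stack = [float(n)]
      while stack:
          m = stack.pop()
          if m <= 1:                   # leaf: nothing left to partition
              continue
          total += c * m
          stack.append(m / 10)
          stack.append(9 * m / 10)
      return total

  for n in [10**2, 10**3, 10**4]:
      total = unbalanced_cost(n)
      print(n, round(total), round(total / (n * math.log2(n)), 2))
      # this ratio stays bounded by a constant as n grows: Theta(n log n)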


Revision date: 2012-02-01. (Please use ISO 8601, the International Standard Date and Time Notation.)