CS 3343/3341: Recitation 5, Heapsort, Median, Etc.

Introduction
to the Design and Analysis of Algorithms

(Textbook. Click for information.)

CS 3343/3341
Analysis of Algorithms
Spring 2012
Recitation 5
Heapsort, Median, Etc.
    Week 5: Feb 14-16
Due (on time): 2012-02-20  23:59:59
Due (late):        2012-02-24  23:59:59

Recitation 5 should be submitted following directions at: submissions with deadlines

2012-02-20 23:59:59 (that's Monday, 20 February 2012, 11:59:59 pm) for full credit.
2012-02-24 23:59:59 (that's Friday, 24 February 2012, 11:59:59 pm) for 75% credit.

Recitation 5 should be submitted following directions at: submissions with deadlines 2012-02-20 23:59:59 (that's Monday, 20 February 2012, 11:59:59 pm) for full credit. 2012-02-24 23:59:59 (that's Friday, 24 February 2012, 11:59:59 pm) for 75% credit.

Median of an Array.

The median of an array is defined to be the value of the "middle" element, if there are an odd number of elements, and otherwise the average of the two middle elements. Thus the median of {18, 14, 23, 11, 19} is 18. For simplicity right now, we'll assume the array has an odd number of elements and that the array starts with A[0].

Give a simple Θ(n*log(n)) algorithm to determine the median.
Consider an array with 13 elements: {16, 19, 9, 5, 13, 8, 7, 4, 21, 2, 6, 11, 18}. What is its median?

We are going to apply your text's partition algorithm to this array and then repeatedly to subarrays of it, more or less as with quicksort. At each stage, however, we focus on only one side of the partition and ignore the other side. Here is an example. (Each stage is the result of an entire application of Partition, all its steps.) The middle location is marked with a hat ^.

\| ^ \| 0 1 2 3 4 5 6 7 8 9 10 11 12 -----------\|------------------------------------------------------------------ Array \| 16 19 9 5 13 8 7 4 21 2 6 11 18 \| Partition 1\| 16 9 5 13 8 7 4 2 6 11 18 19 21 \| Subarray 1 \| 16 9 5 13 8 7 4 2 6 11 (18) (19) (21) \| Partition 2\| 9 5 8 7 4 2 6 11 16 13 (18) (19) (21) \| Subarray 2 \| 9 5 8 7 4 2 6 (11) (16) (13) (18) (19) (21) \| Partition 3\| 5 4 2 6 9 8 7 (11) (16) (13) (18) (19) (21) \| Subarray 3 \| (5) (4) (2) (6) 9 8 7 (11) (16) (13) (18) (19) (21) \| Partition 4\| (5) (4) (2) (6) 7 8 9 (11) (16) (13) (18) (19) (21) \| Subarray 3 \| (5) (4) (2) (6) (7) 8 9 (11) (16) (13) (18) (19) (21) \| Partition 5\| (5) (4) (2) (6) (7) 8 9 (11) (16) (13) (18) (19) (21) \|

           |                                  ^
           |    0    1    2    3    4    5    6    7    8    9   10   11   12 
-----------|------------------------------------------------------------------
Array      |   16   19    9    5   13    8    7    4   21    2    6   11   18 
           |
Partition 1|   16    9    5   13    8    7    4    2    6   11   18   19   21 
           |
Subarray 1 |   16    9    5   13    8    7    4    2    6   11  (18) (19) (21)
           |
Partition 2|    9    5    8    7    4    2    6   11   16   13  (18) (19) (21)
           |
Subarray 2 |    9    5    8    7    4    2    6  (11) (16) (13) (18) (19) (21)
           |
Partition 3|    5    4    2    6    9    8    7  (11) (16) (13) (18) (19) (21)
           |
Subarray 3 |   (5)  (4)  (2)  (6)   9    8    7  (11) (16) (13) (18) (19) (21)
           |
Partition 4|   (5)  (4)  (2)  (6)   7    8    9  (11) (16) (13) (18) (19) (21)
           |
Subarray 3 |   (5)  (4)  (2)  (6)  (7)   8    9  (11) (16) (13) (18) (19) (21)
           |
Partition 5|   (5)  (4)  (2)  (6)  (7)   8    9  (11) (16) (13) (18) (19) (21)
           |

After each partition, we denote the pivot in red. On each side of the pivot are the two subarrays of the partition. (One or both could be empty.) In each case we ignore one of the subarrays and focus on the other. The numbers in parentheses are ones that are ignored. Subarrays 1 and 2 are the left side of the previous partition, while subarrays 3 and 4 are the right side of the previous partition.

Here are results of three examples: a sorted array, one sorted in reverse order, and one that finishes quickly: examples.

What determines which side of the partition we keep and which side we ignore above?
The algorithm continues repeatedly until the new pivot is located in the middle element. This could occur early on. Give a simple alteration of the array above so that the algorithm terminates after the first partition. (Keep the array the same size and mostly the same.)
Give an argument that the algorithm terminates with the correct median value.

This is a remarkable algorithm. Many years ago researches were looking for a Θ(n) algorithm to find the median. Despite a lot of effort, no one could find one, until this simple and elegant algorithm was discovered.

This algorithm has expected performance of Θ(n), assuming a random pivot is chosen at each stage. (The proof of this average performance is beyond the scope of this course [text, pages 215-219.) Assume that the subarray selected at each stage is of length at most 3/4 times the length of the previous subarray. Give a recurrence relation for this case. Using the recursion tree approach, argue carefully that in this special case this is a Θ(n) algorithm.
Write your own program from scratch to find the median of an array with an odd number of elements as above. You can use the code for partitioning found in quicksort. I only want code that will work for this particular instance (array of an odd number of integers), and I don't want you to adapt a median-finding algorithm from some other source. Test your program first with the array above, and then with: {43,22,51,19,11,67,13,36,32,49,28,15,38}. (I worked hard so that the output of my program didn't look too (searching for adjective here ...) You don't need to do this.)
Suppose the array length is anything, either odd or even, and suppose you want to find the value of the ith element in the sorted array, for i = 0, 1, 2, ..., A.length-1. (In the examples above, we were finding the 6th element's value.)
What are you finding when i = 0?
What are you finding when i = A.length-1?
What changes would be needed to the intuitive description of this algorithm in order to handle this more general case?

Weird Topic: 7 Fibonacci Matrices, Part 2 (Incomplete).

The above "weird topic" page is incomplete. The directions are: Finish the page up by adding to the end, so that you get something "weird", or at least "interesting", or at least something that you didn't already know.

Heapsort: See heaps (at the end of this page are links to illustrations of various heap operations, using the same data as in your book), and heapsort (in Java). [Text, pages 151-162.] Following your text, we ignore the 0th element of arrays in Java when working with heaps.

Text, Exercise 6.1-6 on page 154. [Is the array with values {23,17,14,6,13,10,1,5,7,12} a max-heap? If not, why not? Note that these numbers are in array locations A[1], A[2], ... A[10].]
Suppose we used the zero element of our arrays, rather than ignoring it. Give new parent, left and right functions that would work in this case. [But we are not using the zero element in everything below.]
Text, Exercise 6.2-1 on page 156. [Illustrate the operation of Max-Heapify(A,3) on the array A = {27,17,3,16,13,10,1,5,7,12,4,8,9,0}. Is the result a max-heap? If not, why not?
Text, Exercise 6.2-5 on page 156. [Rewrite the code for Max-Heapify so that it uses an iterative control structure (a loop) instead of recursion. This is an example of removing tail-recursion. It's mostly just a matter of giving the second parameter a new value while you go around a loop again.]
Suppose numbers {26, 5, 77, 1, 85, 11, 59, 15, 48, 19} are stored in array locations A[1], ..., A[10]. Build A into a max-heap using the algorithm Build-Max-Heap from the text [page 157] or buildMaxHeap in the program referred to above. (They are the same.) Notice that this starts by calling Max-Heapify(A,5) from the text [page 154] or maxHeapify(A,5) from my program. Then proceed down from 5 to 4 to 3 to 2 to 1. Why not start with 6 instead of 5 above? Show that in this case you get the wrong answer if you start at 1 and work your way up to 5. [This is a little awkward to show in a .txt file, but you can show the action at each step and the resulting array.]
Text, Exercise 6.3-2 on page 159. [Show that the algorithm works in the case above if you start with 6 instead of 5 and work your way down to 1. Show that the algorithm fails in the case above if you start at 1 and work your way up to 5.]
What property does the magic node floor(A.length/2) have? (Notice that in Java, since we have a zero element also, this would be floor((A.length - 1)/2).)
Text, Exercise 6.4-1 on page 160. [Illustrate the operation of heapsort on the array A = {5,13,2,25,7,17,20,8,4}. (Here again, A[1] = 5.)]

Revision date: 2012-02-13. (Please use ISO 8601, the International Standard.)