CS 3343/3341 Analysis of Algorithms
Spring 2004
Outline of Lecture Topics

Lecture Schedule
Week Tuesday Thursday
1 13 January 2004 15 January 2004
2 20 January 2004 22 January 2004
3 27 January 2004 29 January 2004
4 3 February 2004 5 February 2004
5 10 February 2004 12 February 2004
6 17 February 2004 19 February 2004
7 24 February 2004 26 February 2004
8 2 March 2004 4 March 2004
9 9 March 2004 11 March 2004


January 13:
  1. Introduction.
    1. Definition of algorithm (from Knuth). It must have these features:
      1. Finiteness: Will terminate after a finite number of steps.
      2. Definiteness: Each step must be precisely defined.
      3. Input: Zero or more inputs.
      4. Output: One or more outputs.
      5. Effectiveness: Each basic step must be doable.
    2. Algorithms are delicate: the most minute change can break them, in contrast with, say, an electric transformer.
    3. Text's discussion of three algorithms for gcd (pages 4-6):
      1. Euclid's method. (Takes time c*log_2(n).)
      2. Trying all integers from min(m,n) downward. (Takes time c*n.)
      3. Assembling common prime factors. (Takes time 2*log_2(n) or less.)
    4. Binary search (text, pages 133-134):
      1. Basic algorithm. (Takes time log_2(n).)
      2. Binary search techniques occur all over the place.
      3. Example: Traveling Salesman Optimization Problem (TS-OPT) (text, page 348).
        1. Find shortest-length round-trip tour through a weighted graph.
        2. No efficient solution known.
      4. Consider an apparently simpler problem, the Traveling Salesman Decision Problem (TS-DEC):
        1. Is there a tour of length <= D, for a given bound D? (Yes or No)
        2. Given an efficient solution to TS-DEC, the binary search technique solves TS-OPT efficiently. (A code sketch appears in item 6 below.)
      5. Question: If we just have a "black box" program that gives the (Yes/No) answer, the argument above shows how to get the length of the optimal tour. Now, using this as a "black box" program that gives the length of the optimal tour, how could we find the actual tour? (In such a way that if the "black box" is efficient, then finding the actual tour is also efficient.)
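      6. The binary-search step as a Java sketch, assuming integer edge weights and a hypothetical black-box method tourExists(d) that answers TS-DEC (neither name is from the text):

          // Finds the optimal tour length with O(log(totalWeight)) calls
          // to the black box; totalWeight bounds the length of any tour.
          static int optimalTourLength(int totalWeight) {
             int lo = 0, hi = totalWeight;
             while (lo < hi) {
                int mid = (lo + hi) / 2;
                if (tourExists(mid)) hi = mid;   // some tour has length <= mid
                else lo = mid + 1;               // every tour is longer than mid
             }
             return lo;   // smallest D answered "yes"
          }

          // The black box: an efficient implementation is exactly what is unknown.
          static boolean tourExists(int d) { throw new UnsupportedOperationException(); }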


January 15:
  1. More on Algorithms.
    1. Specifying an algorithm (text, page 13): Use English, pseudocode, or actual program.
    2. First of all, an algorithm must be correct: that is, it does the correct calculation and is guaranteed to terminate.
    3. A secondary consideration is efficiency.
      1. Time efficiency: the most important.
      2. Space efficiency: the other important consideration.
    4. Important problem areas: see list, page 19.
    5. Data structures for algorithms (text, pages 26-37):
      1. Linear
        1. array
        2. linked list, plus doubly linked, circularly linked, etc.
        3. applications: stacks, queues.
      2. Binary trees, using nodes with data, left child and right child.
      3. General (rooted) trees: a clever representation of such a tree, and even a forest, as a single binary tree (the first-child, next-sibling representation).
      4. Graphs (and weighted graphs). Representations (pictures, page 30; a sketch appears in item 6 below):
        1. Adjacency matrix.
        2. Adjacency linked list.
      5. Sets: these present more difficult problems in general.
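      6. A small Java sketch of the two graph representations in item 4 (our names, not the text's):

          import java.util.ArrayList;
          import java.util.List;

          class Graph {
             int n;
             boolean[][] matrix;        // adjacency matrix: O(n^2) space, O(1) edge test
             List<List<Integer>> adj;   // adjacency linked lists: O(n + m) space

             Graph(int n) {
                this.n = n;
                matrix = new boolean[n][n];
                adj = new ArrayList<>();
                for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
             }

             void addEdge(int u, int v) {   // undirected edge, kept in both forms
                matrix[u][v] = matrix[v][u] = true;
                adj.get(u).add(v);
                adj.get(v).add(u);
             }
          }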


January 20:

  1. The Efficiency of Algorithms. (text, chapter 2)
    1. Interested in time and space.
    2. Also interested in:
      1. Worst case
      2. Average case
      3. Best case (seldom used)
    3. Everything measured in terms of the size of the input, say n.
    4. Measures of running time: number of instructions executed, number of steps or iterations.
    5. Orders of growth of functions (mainly for measuring time): see Table 2.1 (page 46) and Table 2.2 (page 59).
    6. Notations (text, pages 52-55, esp. Figures 2.1, 2.2, 2.3).
      1. Big-Oh notation: a function is bounded above by a constant times another.
      2. Big-Omega notation: a function is bounded below by a constant times another.
      3. Big-Theta notation: a function is bounded above and below by a constant times another.
      4. Little-Oh notation: a function has a smaller order of growth than another function (text, page 58).
    7. Using limits to calculate orders of growth (text, pages 57-58).
      1. Basic result (page 58).
      2. Use of L'Hopital's Rule.
      3. Examples 1, 2, and 3, pages 57-58.
    8. Algorithms versus Problems:
      1. An Algorithm is a specific way to find a solution to a problem. Usually it is feasible to find the Big-Oh or even Big-Theta performance of an algorithm, often worst-case time. Sometimes one gets average-case time. Less often one finds the space performance. In summary, one often has an upper-bound and a lower bound for the performance of an algorithm, and often they are of the same order.
      2. A Problem requires a solution, and there may be any number of algorithms solving a problem. Given an algorithm that solves a problem, then the Big-Oh performance of the algorithm becomes an upper-bound for all algorithms solving the problem, since one then has a specific solution, so you can do at least as well as this algorithm. However, there may be other, better algorithms.
      3. Only rarely does one have a good lower bound for a problem, that is, only rarely does one have an algorithm that is guaranteed to be as good at solving the problem as any other algorithm. Even if there is a lower bound for a problem, it will often not be of the same order as an upper bound.
    9. Simple examples:
      1. MaxElement: O(n) (actually Big-Theta(n)) (text, page 61).
        (Cannot do any better than this.)
      2. UniqueElements: O(n^2) (actually Big-Theta(n^2) for this algorithm; text, page 63).
        (Can do this at least in time O(n log n); see item 4 below.)
      3. MatrixMultiplication: O(n^3) (actually Big-Theta(n^3) for this algorithm; see text, page 63).
        (It was always known that MM = O(n^3) and MM = Big-Omega(n^2), but everyone assumed MM = Big-Theta(n^3).
        In 1969, Strassen produced an amazing fact: can do in time O(n^k), for k < 3.
        Your text shows this in Section 4.5, pages 144-146: MM is
        O(n^(log_2 7)) = O(n^2.807), as shown by Strassen.
        The best result up to now is
        O(n^2.376), by Coppersmith and Winograd in 1987.)
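      4. A Java sketch of the two UniqueElements approaches from item 2 (our code): the brute-force Big-Theta(n^2) scan next to the O(n log n) sort-first alternative.

          import java.util.Arrays;

          static boolean uniqueBrute(int[] A) {     // Theta(n^2) comparisons
             for (int i = 0; i < A.length - 1; i++)
                for (int j = i + 1; j < A.length; j++)
                   if (A[i] == A[j]) return false;  // duplicate found
             return true;
          }

          static boolean uniqueSorted(int[] A) {    // O(n log n) overall
             int[] B = A.clone();
             Arrays.sort(B);                        // duplicates become adjacent
             for (int i = 0; i < B.length - 1; i++)
                if (B[i] == B[i + 1]) return false;
             return true;
          }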

January 22:

  1. Several Examples.
    1. Tower of Hanoi puzzle. (Text, Section 2.4)
      1. Problem: Move n disks from peg 1 to peg 3.
        Specific algorithm for moves: Recursively move n-1 disks from peg 1 to peg 2. Then move the largest disk from peg 1 to peg 3. Then recursively move the n-1 disks from peg 2 to peg 3.
        It takes M(n) = 2^n - 1 moves.
        (Prove by induction.)
      2. So this algorithm, and hence the Tower of Hanoi puzzle, are in Big-Oh of 2^n.
      3. What is the least number of moves required, L(n)?
        (Prove by induction that L(n) = 2^n - 1.)
      4. Your text uses a recurrence relation for M(n):

          M(n) = 2*M(n-1) + 1, and M(1) = 1.

        Text solves this by back substitution:

          M(n) = 2*M(n-1) + 1 = 2*(2*M(n-2) + 1) + 1 = 2^2*M(n-2) + 2 + 1 =
          2^2*(2*M(n-3) + 1) + 2 + 1 = 2^3*M(n-3) + 4 + 2 + 1 = ...

        Eventually one gets

          M(n) = 2^(n-1)*M(1) + 2^(n-1) - 1 = 2^n - 1.
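
        A Java sketch that carries out this recursive strategy and counts moves, so M(n) = 2^n - 1 can be checked directly (our code, not the text's):

          static long moves = 0;

          static void hanoi(int n, int from, int spare, int to) {
             if (n == 0) return;
             hanoi(n - 1, from, to, spare);   // n-1 disks onto the spare peg
             moves++;                         // the largest disk: from -> to
             hanoi(n - 1, spare, from, to);   // n-1 disks onto the target peg
          }

          // hanoi(n, 1, 2, 3) for n = 1, 2, 3, 4, 5 gives moves = 1, 3, 7, 15, 31.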

    2. Fibonacci Numbers. (Text, Section 2.5)
      1. Fibonacci numbers are defined by

          F(0) = 0, F(1) = 1, and F(n) = F(n-1) + F(n-2), for n >= 2

      2. Write an infinite series (the generating function)

          G(z) = F(0) + F(1)*z + F(2)*z^2 + F(3)*z^3 + ...

          = 0 + z + z^2 + 2*z^3 + 3*z^4 + 5*z^5 + 8*z^6 + ...

        Using the recurrence, one finds

          G(z) = z/(1 - z - z^2)

        Factor 1 - z - z^2 = (1 - p*z)(1 - q*z), where

          p = (1/2)(1 + sqrt(5)), q = (1/2)(1 - sqrt(5))

        Finally, get (steps sketched below)

          F(n) = (p^n - q^n)/sqrt(5)
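
        Sketch of the omitted steps (our working, by partial fractions):

          z/((1 - p*z)(1 - q*z)) = (1/sqrt(5)) * (1/(1 - p*z) - 1/(1 - q*z)),

        since p - q = sqrt(5). Expanding each term as a geometric series,

          G(z) = (1/sqrt(5)) * sum over n >= 0 of (p^n - q^n)*z^n,

        and matching coefficients of z^n gives the formula for F(n).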


January 27:

  1. Algorithms for Fibonacci Numbers.
    1. First algorithm for F(n): uses the recursive definition directly. The recurrence for its running time is almost the same as the recurrence for the Fibonacci numbers themselves, and its solution is the same, so the performance is in O(phi^n) = O(1.618^n) (and actually Big-Theta). (Text, page 81.) So this is an exponential-time algorithm, very inefficient.

    2. Second algorithm for F(n): uses the simple, obvious iterative algorithm. Takes n steps. Clearly O(n), and even Big-Theta(n).

    3. Third algorithm for F(n): This relies on the formula:

           |  0   1  |^n      |  F(n-1)   F(n)   |
           |         |    =   |                  |, where n >= 1
           |  1   1  |        |   F(n)   F(n+1)  |


      These are matrices, and the power is iterated matrix multiplication. With ordinary one-at-a-time matrix multiplication, the use of this formula would still end up taking O(n) time. However, there are much more efficient algorithms to raise a number (or matrix) to a power, as described next.

  2. O(log n) exponentiation.
    1. There are two different algorithms, both given in your text, pages 228-230. We will focus on the second one, called in your text "RightLeftBinary Exponentiation".
    2. Study this algorithm. (We will do this in class.)
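    3. A Java sketch combining the two ideas above: right-to-left binary exponentiation applied to the 2x2 Fibonacci matrix, giving F(n) in O(log n) matrix multiplications (our code, not the text's pseudocode):

        static long[][] mult(long[][] a, long[][] b) {   // 2x2 matrix product
           return new long[][] {
              { a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1] },
              { a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1] }
           };
        }

        static long fib(int n) {
           long[][] result = { {1, 0}, {0, 1} };   // identity matrix
           long[][] base   = { {0, 1}, {1, 1} };
           while (n > 0) {                  // scan the bits of n from low to high
              if ((n & 1) == 1) result = mult(result, base);
              base = mult(base, base);      // square the base at each step
              n >>= 1;
           }
           return result[0][1];             // the F(n) entry of the power matrix
        }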

  3. Brute-force algorithms. (Text, Chapter 3)
    1. Justifications for using brute-force. (Text, page 98.)
    2. Simple examples of brute-force algorithms:
      1. Raising to power n by just multiplying n times.
      2. Checking all integers to get GCD.
      3. Checking all possible factors to get unique factorization.
    3. Simple sorting algorithms:
      1. Selection sort.
      2. Bubble sort.
    4. Sequential search of a list.
    5. Brute-force string matching (text, pages 104-105). (A sketch appears in item 8 below.)
    6. Convex-hull (text, pages 108-111).
    7. Exhaustive search through possible solutions to a problem. (We talked about TS, the Traveling Salesman Problem.)
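    8. A Java sketch of the brute-force string matcher of item 5, O(n*m) in the worst case:

        static int bruteForceMatch(String text, String pattern) {
           int n = text.length(), m = pattern.length();
           for (int i = 0; i <= n - m; i++) {    // candidate starting position in text
              int j = 0;
              while (j < m && text.charAt(i + j) == pattern.charAt(j))
                 j++;
              if (j == m) return i;              // full match starting at i
           }
           return -1;                            // no match
        }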

January 29:

  1. Divide and Conquer.
    1. Finding Max and Min together.
      1. Start with a set of n numbers.
      2. Count comparisons: assume a comparison is expensive, taking more time than the other parts of the algorithm.
      3. Count comparisons to find Max: n - 1.
      4. Count comparisons to find Min (of remaining numbers): n - 2.
      5. Total of C(n) = 2*n - 3 comparisons to find Max and Min.
      6. Divide and conquer algorithm to find Max and Min simultaneously (assumes functions MAX and MIN given that will find max and min of two numbers; if invoked at the same time, they can get the answer with one comparison):
          S = set with n elements (perhaps an array);
          (max, min) = MAXMIN(S);
          
          pair_of_numbers MAXMIN(S) {
             if (S has 2 elements) {
                suppose S is {a, b};
                return (MAX(a, b), MIN(a, b)); // just one comparison
             }
             else {
                divide S into two subsets S1 and S2 (roughly half);
                (max1, min1) = MAXMIN(S1);
                (max2, min2) = MAXMIN(S2);
                return (MAX(max1, max2), MIN(min1, min2)); // two comparisons
             }
          }
          
      7. Recurrence relation to count comparisons:
        T(n) = 1, if n = 2
        T(n) = 2*T(n/2) + 2, if n > 2, n = 2^k, k > 1.
      8. Prove by induction that T(n) = (3/2)*n - 2.
      9. Work through examples with n = 2, 4, 8, 16, 32, getting:
        T(2) = 1, T(4) = 4, T(8) = 10, T(16) = 22, T(32) = 46, ... Compare this with:
        C(2) = 1, C(4) = 5, C(8) = 13, C(16) = 29, C(32) = 61, ...
      10. Both these algorithms are O(n) (and Big-Theta(n)).
    2. Sorting by quicksort (which you've already seen in CS 1723).
      1. Go over the quicksort algorithm in the text (pages 127-132). Pay special attention to the pivot element. See that this algorithm is Big-Theta(n^2) for an array already in sorted order.
      2. Can choose three elements and take the "median of three" as the pivot (text, page 132).
      3. Better way is randomized quicksort: use pseudorandom or true random numbers to help keep from encountering the worst-case performance of quicksort.
      4. Algorithms to randomize the order of elements in an array.
        1. First method: use another array, filled with random numbers. Sort this second array into order, doing the same steps to the first. (Or include the second array's elements as an extra field in the elements of the first array.) This still leaves the problem of sorting an array, but now both arrays to be sorted are in random order, which quicksort will do efficiently with overwhelming probability.
        2. Second much better method: Randomize directly and very efficiently.
            A = array with n elements, indexed by 0 to n-1 inclusive;
            
            void randomize(type[] A) {
               for (int i = 0; i < n-1; i++) {
                  int r = randomInt(i, n-1); // i to n-1 inclusive
                  interchange(A[i], A[r]); // possibly i == r
               }
            }
            
      5. Applications to quicksort: randomize order of input array first. First method above takes twice as long, but the second method is a very fast O(n) pass through the array.
      6. Another randomized quicksort: even better is just to pick the pivot element at random.
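      7. A minimal Java sketch of the last idea, choosing the pivot at random so that no fixed input forces the worst case (our code; the text's quicksort is on pages 127-132):

          import java.util.Random;

          static final Random rand = new Random();

          static void quicksort(int[] A, int lo, int hi) {
             if (lo >= hi) return;
             int p = lo + rand.nextInt(hi - lo + 1);   // random pivot index in [lo, hi]
             swap(A, p, hi);                           // move the pivot to the end
             int pivot = A[hi], i = lo;
             for (int j = lo; j < hi; j++)             // partition around the pivot
                if (A[j] < pivot) swap(A, i++, j);
             swap(A, i, hi);                           // pivot lands in final position i
             quicksort(A, lo, i - 1);
             quicksort(A, i + 1, hi);
          }

          static void swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }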

February 3:

  1. Divide and Conquer (con't).
    1. Strassen's matrix multiplication method: here.

February 5:

  1. Divide and Conquer (con't).
      Strassen's method continued.

February 10:

  1. Master Theorem.
    1. Study the Master Theorem in the book: page 123 and page 427. We do not go over the proof. (We proved one special case while discussing Strassen's method for multiplying matrices.)
    2. Go over applications:
      1. Number of additions for summation (pages 121-123):
          A(n) = 2*A(n/2) + 1.
      2. Merge sort (pages 124-126):
          C(n) = 2*C(n/2) + n - 1.
      3. Divide-and-Conquer Ordinary Matrix Mult.:
          M(n) = 8*M(n/2) + 4*(n/2)^2.
      4. Strassen's Divide-and-Conquer Matrix Mult. (pages 144-147):
          S(n) = 7*S(n/2) + 18*(n/2)^2.
      5. Binary search (pages 133-136):
          B(n) = B(n/2) + 1.
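
      6. Worked check (our arithmetic), writing each recurrence as T(n) = a*T(n/b) + f(n) with f(n) in Theta(n^d):
          A(n): a = 2, b = 2, d = 0; a > b^d, so A(n) is in Theta(n^(log_2 2)) = Theta(n).
          C(n): a = 2, b = 2, d = 1; a = b^d, so C(n) is in Theta(n log n).
          M(n): a = 8, b = 2, d = 2; a > b^d, so M(n) is in Theta(n^(log_2 8)) = Theta(n^3).
          S(n): a = 7, b = 2, d = 2; a > b^d, so S(n) is in Theta(n^(log_2 7)) = Theta(n^2.807...).
          B(n): a = 1, b = 2, d = 0; a = b^d, so B(n) is in Theta(log n).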

February 12:

  1. Searching Lists.
    Asymptotic Execution Times (Big-Theta)

    Algorithm                     Search   Insert   Delete        Comments
    Unordered list in an array    n        1        n             Simple
    Unordered linked list         n        1        1             Simple
    Ordered list in an array      log n    n        n             If search is common
    Binary search tree            log n    log n    with effort   O(n) in worst case
    Balanced tree (AVL)           log n    log n    log n         All are worst case
    Hashing                       1        1        1             Ugly; not in order

    1. Binary search tree of strings: Java.
    2. Binary search tree of doubles: Java.
    3. Hash table insert and delete: Java.
    4. Fibonacci searching: Java.
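
    5. As a stand-in for the linked Java programs, a minimal sketch of binary-search-tree search and insert (average Big-Theta(log n), worst case Big-Theta(n)):

        // BST of doubles; duplicates go to the right (our convention).
        class Node {
           double key;
           Node left, right;
           Node(double key) { this.key = key; }
        }

        class BST {
           Node root;

           boolean search(double key) {
              Node cur = root;
              while (cur != null) {
                 if (key == cur.key) return true;
                 cur = (key < cur.key) ? cur.left : cur.right;
              }
              return false;
           }

           void insert(double key) {   // the new key always becomes a leaf
              if (root == null) { root = new Node(key); return; }
              Node cur = root;
              while (true) {
                 if (key < cur.key) {
                    if (cur.left == null) { cur.left = new Node(key); return; }
                    cur = cur.left;
                 } else {
                    if (cur.right == null) { cur.right = new Node(key); return; }
                    cur = cur.right;
                 }
              }
           }
        }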

February 17:

  1. Correctness Proofs and Loop Invariants.
    1. Correctness proofs require proof of two things:
      1. The algorithm terminates.
      2. On termination, the algorithm has correctly calculated the answer.
    2. The problem with assignment: x = x + 1.
      1. If "=" is ordinary mathematical equals, in what sense or under what circumstance is this true? (Answer: never.)
      2. Rewrite as: x_new = x_old + 1.
      3. Realize that we are dealing with two different x's here: before and after the assignment.
    3. Simple exponentiation algorithm:
        
        int exp(int X, int Y) {
           int x = X, y = Y, z = 1;
           while (y > 0) {
              z = z * x;
              y = y - 1;
           }
           return z;
        }
            

      This obviously does the correct calculation. But there is a more formal approach.

    4. Use the loop invariant: z * x^y = X^Y. (An annotated version appears in item 7 below.)
      1. Prove that loop invariant is true initially.
      2. Prove that after each execution of one iteration of the loop, the loop invariant is still true.
      3. Prove that on termination, the equation for the loop invariant shows that the correct value is being calculated.
      4. Prove that the algorithm terminates.
    5. Second example, more complex exp program: Exponentiation (PDF). (See pages 16-18.)
    6. GCD algorithm and extended GCD algorithm: GCD (PDF). (See pages 13-16 and page 19.)
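
    7. The simple exponentiation routine again, with the loop invariant recorded as assertions (a sketch; enable assertions with java -ea):

        static int exp(int X, int Y) {
           int x = X, y = Y, z = 1;
           // invariant z * x^y == X^Y holds initially: 1 * X^Y == X^Y
           while (y > 0) {
              z = z * x;
              y = y - 1;
              assert z * Math.pow(x, y) == Math.pow(X, Y);   // invariant re-established
           }
           // on termination y == 0, so the invariant gives z = z * x^0 = X^Y
           return z;
        }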

February 19:

  1. Heapsort.
    1. Basic algorithm is presented in the text, pages 218-225.
    2. Start definition of a heap.
    3. Then see how to construct a heap from a batch of items: HeapBottomUp in your text, involving repeated applications of "heapify".
    4. Combine these pieces into a Heapsort that sorts in place in guaranteed Big-Theta(n log n) time.
    5. Use the same concept to give a priority queue that is guaranteed to yield each item with the largest priority in time Big-Theta(log n).

February 24:

  1. Heapsort (con't).
    1. Details about heaps and the heapsort algorithm.
      1. A heap with n elements resides in an array A, with indexes from 1 to n inclusive. (In C/C++/Java we ignore the element A[0].)
        There is also a value heapsize(A), which is <= n. (This says that only array elements up to and including heapsize(A) are regarded as part of the heap.)
        Regard this array as a binary tree, where A[1] is the root, and for any node i,
        1. Parent(i) = floor(i/2)
        2. LeftChild(i) = 2*i
        3. RightChild(i) = 2*i + 1
      2. Restoring the heap property, if violated at node i:
          
          MaxHeapify(A, i)
             l = LeftChild(i)
             r = RightChild(i)
             if (l <= heapsize(A) && A[l] > A[i]) largest = l
                                             else largest = i
             if (r <= heapsize(A) && A[r] > A[largest]) largest = r
             if (largest != i) {
                 exchange(A[i], A[largest])
                 MaxHeapify(A, largest)
             }
          
      3. Building a heap, starting with array A[1], ..., A[n] and heapsize(A) == n.
          
          BuildMaxHeap(A)
             heapsize(A) = n
             for (i = floor(n/2); i >= 1; i--)
                MaxHeapify(A, i)
          
      4. Heapsort algorithm, starting with array A[1], ..., A[n]
          
          Heapsort(A)
             BuildMaxHeap(A)
             for (i = n; i >= 2; i--) {
                exchange(A[1], A[i])
                heapsize(A)--
                MaxHeapify(A, 1)
             }
          
    2. Use of a heap as a priority queue.
      1. Start with an array A holding n priority numbers in locations A[1], ..., A[n].
      2. Use BuildMaxHeap(A) to turn A into a heap.
      3. Repeatedly remove the largest remaining element from A, using the following, which is about the same as one step of the heapsort algorithm:
          
          HeapExtractMax(A)
             if (heapsize(A) < 1)
                error("Heap underflow")
             max = A[1]
             A[1] = A[heapsize(A)]
             heapsize(A)--
             MaxHeapify(A, 1)
             return max
          
      4. Another useful algorithm lets us increase a value in the array and restore the heap property:
          
          HeapIncreaseKey(A, i, key)
             if (key < A[i])
                error("New key smaller than old")
             A[i] = key
             while (i > 1 && A[Parent(i)] < A[i]) {
                exchange(A[i], A[Parent(i)])
                i = Parent(i)
             }
          
      5. Finally, we can insert a new element into the array:
          
          MaxHeapInsert(A, key)
             heapsize(A)++
             A[heapsize(A)] = -infinity
             HeapIncreaseKey(A, heapsize(A), key)
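
      6. A compact Java rendering of the routines above as a max-heap priority queue (1-based indexing with A[0] unused; a sketch):

          class MaxHeap {
             private final int[] A;
             private int heapsize;

             MaxHeap(int[] items) {              // BuildMaxHeap
                A = new int[items.length + 1];
                heapsize = items.length;
                System.arraycopy(items, 0, A, 1, items.length);
                for (int i = heapsize / 2; i >= 1; i--) maxHeapify(i);
             }

             private void maxHeapify(int i) {    // restore the heap property at i
                int l = 2 * i, r = 2 * i + 1, largest = i;
                if (l <= heapsize && A[l] > A[largest]) largest = l;
                if (r <= heapsize && A[r] > A[largest]) largest = r;
                if (largest != i) {
                   int t = A[i]; A[i] = A[largest]; A[largest] = t;
                   maxHeapify(largest);
                }
             }

             int extractMax() {                  // HeapExtractMax
                if (heapsize < 1) throw new IllegalStateException("Heap underflow");
                int max = A[1];
                A[1] = A[heapsize--];
                maxHeapify(1);
                return max;
             }
          }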
          

February 26:

  1. Dynamic Programming.
    1. Memoization. Save answers to sub-problems and try to look up a saved answer first.
      Example: Computing Fibonacci Numbers here. (A sketch also appears in item 3 below.)

    2. Matrix-chain multiplication.
      Example: Computing the optimal order for multiplication of matrices: here.

      Three-page introduction to this subject (PDF): Page 1,   Page 2,   Page 3.
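
    3. A sketch of memoized Fibonacci, standing in for the linked example in item 1: save each answer and look it up before recomputing, turning the exponential recursion into Theta(n) work.

        import java.util.HashMap;
        import java.util.Map;

        static final Map<Integer, Long> memo = new HashMap<>();

        static long fib(int n) {
           if (n <= 1) return n;
           Long saved = memo.get(n);    // try to look up a saved answer first
           if (saved != null) return saved;
           long result = fib(n - 1) + fib(n - 2);
           memo.put(n, result);         // save the answer to this sub-problem
           return result;
        }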


March 2:

Review for Mid-term Exam


March 4:

Mid-Term Exam
(answers)


March 9:

  1. Dynamic Programming (con't).
    1. Integer Knapsack Problems.
      1. Integer Knapsack Decision Problem (IKDP): Start with n positive integers a_1, a_2, ..., a_n, and with an integer K. Is there a subset of the integers {a_i} which adds up exactly to K?

      2. Integer Knapsack Optimization Problem (IKOP): Start with n positive integers a_1, a_2, ..., a_n, and with an integer K. Find a subset of the integers {a_i} which adds up exactly to K, if such a subset exists.

      3. Dynamic programming solution to the Integer Knapsack Decision Problem (taken from Computers and Intractability by Garey and Johnson):

        Let B be the sum of all the numbers {a_i}. For each 1 <= i <= n and each 0 <= j <= B, let t(i, j) denote the truth value (true or false) of the statement "there is a subset of {a_1, a_2, ..., a_i} which adds up exactly to j." Make up a table with a row for each value of i and a column for each value of j. Then fill in the first row, using the fact that t(1, j) = T if and only if either j = 0 or j = a_1. For subsequent values of i and j, the entry t(i, j) in row i has value T if and only if either

          t(i-1, j) = T

        or

          j >= a_i and t(i-1, j - a_i) = T.

        When we are done filling in the entire table, check the entry t(n, K). The answer to the decision problem is "yes" if and only if this last entry is T. (A code sketch appears in item 6 below.)

      4. To see how to solve the corresponding Integer Knapsack Optimization Problem, see Assignment 8.

      5. This dynamic programming solution to IKDP depends on using integers that are "not too big", since the sum of the integers determines the size of the main table. So we have a Big-Theta(n*B) algorithm, where there are n numbers and B is their sum. For the general IKOP (with large integers) there is no known efficient algorithm.
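
      6. A Java sketch of the table fill from item 3 (our code; assumes n >= 1):

          static boolean knapsackDecision(int[] a, int K) {
             int n = a.length;
             int B = 0;
             for (int x : a) B += x;            // B = sum of all the numbers
             if (K < 0 || K > B) return false;
             boolean[][] t = new boolean[n + 1][B + 1];
             t[1][0] = true;                    // first row: j = 0 or j = a_1
             t[1][a[0]] = true;
             for (int i = 2; i <= n; i++)
                for (int j = 0; j <= B; j++)
                   t[i][j] = t[i - 1][j]
                          || (j >= a[i - 1] && t[i - 1][j - a[i - 1]]);
             return t[n][K];                    // "yes" iff this entry is T
          }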

March 11:

  1. Dynamic Programming (con't).
    1. Optimal Binary Search Trees.
      1. See the text, pages 289-294 for the algorithm to do this.
      2. Initial example of the five possible binary search trees with three nodes here in PDF. Each of them is a possible optimal tree, depending on the probabilities, from a chart given here in PDF (hard to read) or here in GIF (with a misprint: interchange I and II with IV and V).
      3. A binary search tree obtained by inserting the 31 most frequent English words in decreasing order of frequency, here in PDF.
      4. An optimal binary search tree using the 31 most frequent English words and their frequencies here in PDF.