# EECS 586, Winter 2006

(This section will be updated as we go and may include more material than originally present on the top-level webpage.)

## Previous discussion topics

• Design elements of Divide-and-Conquer and Recursion, and the analysis technique of Solving Recurrences to determine runtime. CLRS 2.3 and CLRS 4. A recursive call can be viewed as a special case of a reduction.
• Design element of Iteration and corresponding analysis technique of using Loop Invariants to prove correctness. CLRS 2.1 (more detail than first lecture). Loop invariants are also useful in proving efficiency.

Proving correctness using a loop invariant is similar to mathematical induction. The main differences are that, when showing the maintenance part of the loop invariant, we may assume that the loop test passes, and that there is an explicit termination condition to show, for which we may assume that the loop test fails.
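As a concrete illustration, insertion sort (the standard example from CLRS 2.1) can be annotated with its loop invariant; the sketch below is mine, not the lecture's, with the invariant checked by an assertion:

```python
def insertion_sort(a):
    """Sort a list in place, maintaining a loop invariant."""
    for i in range(1, len(a)):
        # Invariant (initialization and maintenance): a[0..i-1] is sorted.
        # We may assume the loop test i < len(a) passed to get here.
        assert all(a[j] <= a[j + 1] for j in range(i - 1))
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    # Termination: the loop test fails with i == len(a), so the invariant
    # gives that the whole array a[0..n-1] is sorted.
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```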

• Big-O, Big-Omega, etc., notation. CLRS 3.
• Some recurrences can be solved automatically, using tools like `maple` or `mathematica`. Much of the theory of differential equations can be applied to solving recurrences. It is sometimes also useful to look for patterns by computing the values explicitly, either top-down or bottom-up.
• We made a special focus on unequal divide-and-conquer, which often arises in practice and is sometimes harder to solve automatically. Often we break a problem of size n into two problems of sizes j and k that sum to n, and we are interested in how unbalanced the split is. The closer the split is to balanced, the better the resulting runtime.
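As an example of looking for patterns by computing values explicitly, one can tabulate an unbalanced recurrence bottom-up; the recurrence T(n) = T(n/3) + T(2n/3) + n below is an illustrative choice, not necessarily one from lecture. The ratio T(n)/(n log n) settling toward a constant suggests Theta(n log n) growth:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Hypothetical unbalanced divide-and-conquer recurrence:
    T(n) = T(n/3) + T(2n/3) + n, with T(n) = 1 for n <= 1."""
    if n <= 1:
        return 1
    return T(n // 3) + T(2 * n // 3) + n

# Tabulate T(n) / (n * log2 n) to guess the order of growth.
for n in [10**2, 10**3, 10**4]:
    print(n, round(T(n) / (n * math.log2(n)), 3))
```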
• We studied heaps as an example of a data structure that might be used to implement the priority queue abstract data type. We stressed the difference between the abstract data type and the implementation; this distinction is similar to the distinction between function declaration and definition in C. In a data structure, there is tension among the various supported methods; making one method faster may make another slower or increase the space needed by the structure.
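The ADT-versus-implementation distinction can be seen in Python's standard library: `heapq` implements the priority queue ADT with a binary heap, giving O(log n) insert and extract-min. (A sorted list would trade the other way: O(1) extract-min but O(n) insert, illustrating the tension among methods.) A minimal sketch:

```python
import heapq

# Priority queue ADT: insert and extract-min.
# Implementation here: a binary min-heap stored in a Python list.
pq = []
for key in [5, 1, 4, 2, 3]:
    heapq.heappush(pq, key)            # insert: sift up, O(log n)

out = [heapq.heappop(pq) for _ in range(len(pq))]   # extract-min: sift down
print(out)  # [1, 2, 3, 4, 5]
```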
• Probability (CLRS 5 and Appendix C). Expectation and variance. The expected cost of an algorithm and, using the Markov inequality, bounding the probability that a cost exceeds a multiple of its expected value. The Markov inequality applied to the square of X-E[X] gives the Chebyshev inequality, which bounds the probability of large deviations from the mean in terms of the variance.
• Wrap-up of variance. The birthday paradox: n days in the year and k people in a room. Find k as a function of n such that the probability of at least one pair among the k having the same birthday is around 1/2. To do this, we let X be the number of pairs with a birthday surprise, which can be written as a sum over indicator random variables for individual pairs. We showed:
• If k is less than sqrt(n)/10, then E[X] is much less than 1. By Markov, the probability that X is at least one is much less than 1/2.
• If k is greater than 10sqrt(n), then E[X] is much greater than 1 and var(X) is much less than (E[X])^2. It follows that, with high probability, X is close to its expectation, which is much greater than 1. That is, with probability much greater than 1/2, X is at least 1.
From this, it follows that, when k is about sqrt(n), we have probability around 1/2 of at least 1 birthday surprise.
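The k ≈ sqrt(n) threshold is easy to check empirically; a Monte Carlo sketch (with the concrete values n = 365 and k = 23 chosen for illustration):

```python
import random

def collision_prob(n, k, trials=20000):
    """Estimate the probability that k people among n equally likely
    birthdays include at least one coinciding pair."""
    hits = 0
    for _ in range(trials):
        days = [random.randrange(n) for _ in range(k)]
        if len(set(days)) < k:          # some day occurs twice
            hits += 1
    return hits / trials

# E[X] = C(k,2) / n: number of pairs times the per-pair collision
# probability 1/n. For n = 365, k = 23 is about 1.2 * sqrt(n), and the
# collision probability is roughly 1/2.
print(collision_prob(365, 23))
```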

Brief discussion of Quicksort and order statistics. We gave an analysis of Quicksort on a random input, showing that the expected number of comparisons is O(n log(n)). Next, we gave a few different variations, in which we used a randomized algorithm to find a good pivot. We tried taking k random pivots. First we showed that the probability that ALL k tries are bad is exponentially small in k. In this case, we could test each pivot and use one that works. We then showed that the probability that k/2 or more are bad is exponentially small in k. In this case, we could use the median pivot, since, if the median pivot is bad, then at least half of the tries are bad, and that's a low probability event.
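The median-of-k-tries idea can be sketched as follows. This is my illustrative version, not the lecture's code: the helper `good_pivot` takes k random samples and returns their median, relying on the argument above that the median sample is bad only if at least half the samples are bad:

```python
import random

def good_pivot(a, k=9):
    """Take k random samples and return their median as the pivot.
    If this median were a bad pivot (outside the middle half of a),
    then at least k/2 of the samples would be bad -- an event whose
    probability is exponentially small in k."""
    samples = sorted(random.choice(a) for _ in range(k))
    return samples[k // 2]

def quicksort(a):
    if len(a) <= 1:
        return a
    p = good_pivot(a) if len(a) >= 9 else a[len(a) // 2]
    less    = [x for x in a if x < p]
    equal   = [x for x in a if x == p]
    greater = [x for x in a if x > p]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```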

• Hashing. CLRS 11. A hash table is an efficient data structure for insert/delete/search operations. The space needed for the structure can be little more than the O(n) needed for the n stored items, and the expected search time can be O(1). We constructed a universal hash family by combining a family of pairwise-independent permutations with an arbitrary hash function that maps the universe nearly uniformly into a hash table of the desired size. We showed how to build a perfect hash table that supports insertions followed by searches and uses expected space O(n) and worst-case search time O(1).
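A different (and classic) universal family than the permutation-based construction from lecture is the multiply-mod family h(x) = ((ax + b) mod p) mod m for a prime p larger than any key; the sketch below uses it with chaining, as an illustration of drawing a random function from a universal family:

```python
import random

def make_universal_hash(m, p=2_147_483_647):
    """Draw one function from the universal family
    h(x) = ((a*x + b) mod p) mod m, where p is prime (here 2^31 - 1)
    and all keys are assumed to be integers less than p."""
    a = random.randrange(1, p)
    b = random.randrange(p)
    return lambda x: ((a * x + b) % p) % m

m = 16
h = make_universal_hash(m)
table = [[] for _ in range(m)]      # resolve collisions by chaining
for key in [10, 22, 37, 40, 52, 60, 70, 72, 99]:
    table[h(key)].append(key)
```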
• Dynamic Programming. CLRS 15.2, 15.4, 15.5.
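As one example from this reading, CLRS 15.4 covers the longest common subsequence problem; a sketch of the bottom-up table computation:

```python
def lcs_length(x, y):
    """Length of a longest common subsequence (CLRS 15.4), via the recurrence
    c[i][j] = c[i-1][j-1] + 1           if x[i-1] == y[j-1]
            = max(c[i-1][j], c[i][j-1]) otherwise."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCBA"), the CLRS example
```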
• Greedy Algorithms. Example from CLRS 16.3.
• Amortized analysis, CLRS 17.
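The standard CLRS 17 example is table doubling: a single append can cost O(n), but n appends cost O(n) in total, so the amortized cost per append is O(1). A sketch that counts element copies to verify the aggregate bound:

```python
class DynamicArray:
    """Append with table doubling (CLRS 17): worst-case O(n) per append,
    amortized O(1) -- total copies over n appends are less than 2n."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None]
        self.copies = 0               # count element copies to check the bound

    def append(self, x):
        if self.size == self.capacity:
            new = [None] * (2 * self.capacity)
            for i in range(self.size):
                new[i] = self.data[i]
                self.copies += 1
            self.data, self.capacity = new, 2 * self.capacity
        self.data[self.size] = x
        self.size += 1

arr = DynamicArray()
for i in range(1000):
    arr.append(i)
print(arr.copies)  # 1 + 2 + 4 + ... + 512 = 1023, below the 2n = 2000 bound
```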
• Binary search trees from CLRS 12 and taste of CLRS 13.
• Next, Minimum Spanning Trees from CLRS 23. These are further examples of greedy algorithms.
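Kruskal's algorithm from CLRS 23 is a compact example of the greedy approach: take edges in weight order, skipping any that would close a cycle. A sketch with a simple union-find, on a small hypothetical graph:

```python
def kruskal(n, edges):
    """Weight of a minimum spanning tree (CLRS 23, Kruskal): scan edges in
    weight order, using union-find to skip edges that would form a cycle.
    edges is a list of (weight, u, v) tuples over vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                        # safe edge: joins two components
            parent[ru] = rv
            total += w
    return total

# Hypothetical example: a weighted triangle plus one pendant vertex.
print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))  # 1 + 2 + 4 = 7
```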
• Single-source shortest path, CLRS 24.
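From CLRS 24, Dijkstra's algorithm for nonnegative edge weights ties back to the priority queue ADT discussed earlier; a sketch using a heap with lazy deletion (stale entries are skipped), on a hypothetical 4-vertex graph:

```python
import heapq

def dijkstra(adj, s):
    """Single-source shortest-path distances (CLRS 24, Dijkstra) for
    nonnegative weights. adj maps a vertex to a list of (neighbor, weight)."""
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                        # stale heap entry; skip it
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd                # relax edge (u, v)
                heapq.heappush(pq, (nd, v))
    return dist

# Hypothetical example graph.
adj = {0: [(1, 4), (2, 1)], 2: [(1, 2), (3, 5)], 1: [(3, 1)]}
print(dijkstra(adj, 0))  # distances 0, 3, 1, 4 to vertices 0, 1, 2, 3
```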
• NP-Completeness, for several classes. For March 28, please read Garey and Johnson chapters 1 and 2 at a high level. For example, in this class, we want to get a feel for what Cook's theorem says, but we won't focus on the details.
• Continue reading from GJ chapter 3 at a high level or the equivalent in CLRS, if you prefer. We will talk briefly about minesweeper when going through constructions of gadgets. See Richard Kaye's minesweeper page.