The Selection
Algorithm: Design & Analysis 8

In the last class
- Heap structure and the partial order tree property
- The strategy of Heapsort
- Keeping the partial order tree property after the maximal element is removed
- Constructing the heap
- Complexity of Heapsort
- Accelerated Heapsort

The Selection
- Finding max and min
- Finding the second largest key
- Adversary argument and lower bound
- Selection problem: median
- A linear time selection algorithm
- Analysis of the selection algorithm
- A lower bound for finding the median

The Selection Problem
Problem: Suppose E is an array containing n elements with keys from some linearly ordered set, and let k be an integer such that $1 \le k \le n$. The selection problem is to find an element with the kth smallest key in E.
A special case: finding the max or the min, that is, k = n or k = 1.

Lower Bound of Finding the Max
For any algorithm A that can only compare and copy numbers, in the worst case, A cannot
do fewer than n-1 comparisons to find the largest entry in an array with n entries.
Proof: Assume an array with n distinct entries. We can exclude a specific entry from being the largest only after it has been determined to be the "loser" to at least one other entry. So n-1 entries must be "losers" in comparisons done by the algorithm. However, each comparison has only one loser, so at least n-1 comparisons must be done.

Decision Tree and Lower Bound
Since the decision tree for the selection problem must have at least n leaves, the height of the tree is at least $\lceil \lg n \rceil$. This is not a good lower bound: it is far weaker than the n-1 bound above.
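The n-1 lower bound for finding the max is matched by a plain left-to-right scan. The following is a minimal sketch of that scan, not code from the lecture; the name `find_max` and the returned comparison counter are assumptions made for illustration.

```python
def find_max(entries):
    """Return (largest entry, number of key comparisons used).

    Illustrative helper: every comparison produces exactly one new
    "loser", so n entries need n-1 comparisons.
    """
    if not entries:
        raise ValueError("entries must be non-empty")
    best = entries[0]
    comparisons = 0
    for candidate in entries[1:]:
        comparisons += 1  # one key comparison, one loser
        if candidate > best:
            best = candidate
    return best, comparisons


largest, used = find_max([7, 3, 9, 1, 4])
print(largest, used)  # 9 4
```

With n = 5 entries the scan uses 4 comparisons, which is exactly n-1.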
[Figure: decision tree example for n=4. There are more than n leaves!]

Finding max and min
The strategy:
- Pair up the keys and do n/2 comparisons (if n is odd, $E_n$ is left uncompared).
- Run findMax on the set of larger keys and findMin on the set of smaller keys (if n is odd, $E_n$ is included in both sets).
Number of comparisons:
- For even n: $n/2 + 2(n/2 - 1) = 3n/2 - 2$
- For odd n: $(n-1)/2 + 2\left((n-1)/2 + 1 - 1\right) = \lceil 3n/2 \rceil - 2$

Unit of Information
That x is max can be known only when it is certain that every key other than x has lost some comparison. That y is min can be known only when it is certain that every key other than y has won some comparison. Each win or loss is
counted as one unit of information, so any algorithm must gather at least 2n-2 units of information to be sure of identifying the max and the min.

Adversary Strategy
The principle: let a key win if it has never lost, or let a key lose if it has never won, and change one value if necessary.

Lower Bound by Adversary Strategy
Construct an input that forces the algorithm to do as many comparisons as possible, that is, to give away as few units of new information as possible with each comparison.
It can be arranged that 2 units of new information are given away only when the status of the two compared keys is N,N (neither key has been in a comparison yet).
It is always possible to
give an adversary response for every other status so that at most one new unit of information is given away, without any inconsistency.
So the lower bound is $n/2 + n - 2 = 3n/2 - 2$ comparisons (for even n): at most n/2 comparisons can yield 2 units each, and the remaining n-2 units cost at least one comparison each.

An Example Using Adversary
[Figure: a run of the adversary, raising or lowering key values according to the strategy.]
- Now x3 is the only key that has never lost, so x3 is max.
- Now x4 is the only key that has never won, so x4 is min.
- 8 comparisons! The lower bound is 7.

Finding the Second-Largest Key
Using FindMax twice is a solution with 2n-3 comparisons.
For a better algorithm, the idea is to collect useful information during the first FindMax to reduce the number of comparisons in the second FindMax.
Useful information: a key that lost to a key other than max cannot be the second-largest key.
The worst case for running FindMax twice is "no information" (x1 is max).

Second Largest Key by Tournament
[Figure: tournament tree over the keys x1 to x9; x2 wins at the root.]
x2 is max. Only x1, x3, x5, x6
may be the second largest key.
Larger key bubbles up.
The length of the longest path is $\lceil \lg n \rceil$, which bounds the number of keys compared to max.

Analysis of Finding the Second
Any algorithm that finds secondLargest must also find max first. (n-1 comparisons)
The secondLargest can only be among the keys that lose directly to
max.
On its path bubbling up to the root of the tournament tree, max beats at most $\lceil \lg n \rceil$ keys.
Pick the secondLargest among them. ($\lceil \lg n \rceil - 1$ comparisons)
Total: $n + \lceil \lg n \rceil - 2$

Lower Bound by Adversary
Theorem: Any algorithm (that works by comparing keys) to find the second largest in a set of n keys must do at least $n + \lceil \lg n \rceil - 2$ comparisons in the worst case.
Proof: There is an adversary strategy that can force any algorithm that finds secondLargest to compare max to $\lceil \lg n \rceil$ distinct keys.

Weighted Key
Assign a weight w(x) to each key. The initial values are all 1.
Adversary rules: [Table: adversary rules for comparing weighted keys]
Note: in one comparison, a key's weight at most doubles.
Zero = Loss

Lower Bound by Adversary: Details
Note: the sum of the weights is always n.
Let x be max; then x is the only key with nonzero weight, that is, w(x) = n.
By the adversary rules: $w_k(x) \le 2 w_{k-1}(x)$
Let K be the number of comparisons x wins against previously undefeated keys:
$n = w_K(x) \le 2^K w_0(x) = 2^K$
So $K \ge \lceil \lg n \rceil$.

Tracking the Losers to MAX
[Figure: the tournament stored in a heap structure.]
- Building a heap structure of 2n-1 entries, using n-1 extra space
- n entries in input
- The extra entries are to be filled with winners

Finding the Median: the Strategy
Observation: If we can partition the problem set of keys into 2 subsets:
S1 and S2, such that every key in S1 is smaller than every key in S2, then the median must be located in the subset with more elements.
Divide-and-Conquer: only one subset needs to be processed recursively.

Adjusting the Rank
The rank of the median (of the original set) within the subset under consideration can be computed easily.
An example:
- Let n = 255. The rank of the median we want is 128.
- Assume |S1| = 96 and |S2| = 159.
- Then the original median is in S2, and its new rank is 128 - 96 = 32.

Partitioning: Larger and Smaller
Divide the array under consideration into two subsets, "small" and "large"; the one with more elements will be
processed recursively.
[Figure: the array split at splitPoint around the pivot into a "small" segment and a "large" segment; the larger segment is to be processed recursively.]
- "small": for any element in this segment, the key is less than the pivot.
- "large": for any element in this segment, the key is not less than the pivot.
A "bad" pivot will give a very uneven partition!

Selection: the Algorithm
Input: S, a set of n keys; and k, an integer such that $1 \le k \le n$.
Output: The kth smallest key in S.
Note: Median selection is only a special case of the algorithm, with $k = \lceil n/2 \rceil$.
Procedure:
    Element select(SetOfElements S, int k)
      if (|S| ≤ 5)
        return direct solution;
      else
        construct the subsets S1 and S2;
        process one of S1, S2 recursively (for the median, the one with more elements).
This has the same problem as quicksort: an imbalanced partition.

Partition Improved: the Strategy
[Figure: all the elements are put in groups of 5; each group is in increasing order, its median is marked, and the groups are arranged in increasing order of their medians.]

Constructing the Partition
Find m*, the median of the medians of all the groups of 5, as illustrated previously.
Compare each key in sections A and D to m*, and let
$S_1 = C \cup \{x \mid x \in A \cup D \text{ and } x < m^*\}$
$S_2 = B \cup \{x \mid x \in A \cup D \text{ and } x > m^*\}$
(m* is to be used as the pivot for the partition.)

Divide and Conquer
    if (k = |S1|+1) return m*;
    else if (k ≤ |S1|) return select(S1, k);      // recursion
    else return select(S2, k-|S1|-1);             // recursion

Counting the Number of Comparisons
For simplicity: assume n = 5(2r+1) for all calls of select.
$W(n) \le 6(n/5) + W(n/5) + 4r + W(7r+2)$
- $6(n/5)$: finding the median in every group of 5
- $W(n/5)$: finding the median of the medians
- $4r$: comparing all the elements in $A \cup D$ with m*
- $W(7r+2)$: the extreme case, with all the elements in $A \cup D$ in one subset
Note: r is about n/10, and 0.7n+2 is about 0.7n, so
$W(n) \le 1.6n + W(0.2n) + W(0.7n)$

Worst Case Complexity of Select
[Recursion tree, one row per level]
- Level 0: W(n), cost 1.6n. Row sum: 1.6n
- Level 1: W(.2n) with cost 1.6(.2n), W(.7n) with cost 1.6(.7n). Row sum: 1.6(.9)n
- Level 2: W(.04n), W(.14n), W(.14n), W(.49n), with costs 1.6(.04n), 1.6(.14n), 1.6(.14n), 1.6(.49n). Row sum: 1.6(.81)n
- Level 3: $W(.2^3 n)$, three copies of $W(.2^2(.7)n)$, three copies of $W(.2(.7)^2 n)$, $W(.7^3 n)$. Row sum: $1.6(.9)^3 n$
Note: The row sums form a decreasing geometric series, so $W(n) \in \Theta(n)$.

Relation to Median
Observation: Any algorithm that selects the median must know the relation of every element to the median.

Crucial Comparison
A crucial comparison establishes the relation of some x to the median.
Definition (for a comparison involving a key x):
Crucial comparison for x:
the first comparison where $x > y$ for some $y \ge \text{median}$, or $x < y$ for some $y \le \text{median}$. (A comparison of x and y where x > median and y < median is un-crucial.)

Adversary for Lower Bound
Status of a key during the running of the algorithm:
- L: Has been assigned a value larger than the median
- S: Has been assigned a value smaller than the median
- N: Has not yet been in a comparison
Adversary rule:
    Comparands        Adversary's action
    N, N              make one L, the other S
    L, N or N, L      change N to S
    S, N or N, S      change N to L
    (In all other cases, just keep consistency.)
Each specified action pairs a key larger than the median with a key smaller than the median.

Notes on the Adversary Arguments
All actions explicitly specified above make the comparisons un-crucial.
At least (n-1)/2 L or S can be assigned freely.
If there are already (n-1)/2 keys with status S, a value larger than the median must be assigned to the new key, and if there are already (n-1)/2 keys with status L, a value smaller than the median must be assigned to the new key. The last assigned value is the median.
So an adversary can force the algorithm to do at least (n-1)/2 un-crucial comparisons (in the case that the algorithm starts out by doing (n-1)/2 comparisons involving two N keys).

Lower Bound for Selection Problem
Theorem: Any algorithm to find the median of n keys (for odd n) by comparison of keys must do at least 3n/2 - 3/2 comparisons in the worst case.
Argument:
- At least n-1 crucial comparisons must be done.
- An adversary can force the algorithm to perform as many as (n-1)/2 un-crucial comparisons.
(Note: the algorithm can always start out by doing (n-1)/2 comparisons involving two N keys, so only (n-1)/2 L or S are left for the adversary to assign freely according to the adversary rule.)
Total: (n-1) + (n-1)/2 = 3n/2 - 3/2.

Home Assignment
5.2, 5.4, 5.6, 5.8, 5.12-14, 5.17
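
Looking back at the linear-time selection procedure from this lecture (groups of 5, the median of medians m* as pivot, one recursive call on S1 or S2), the steps can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the lecture's own code: keys are assumed distinct, the "direct solution" for at most 5 keys is done with `sorted`, and every key is compared with m* instead of reusing the sections A, B, C, D. The name `select` follows the pseudocode; `m_star`, `s1` and `s2` are illustrative names.

```python
def select(keys, k):
    """Return the k-th smallest key (1-based) of a list of distinct keys."""
    if not 1 <= k <= len(keys):
        raise ValueError("k must satisfy 1 <= k <= n")
    if len(keys) <= 5:
        return sorted(keys)[k - 1]  # direct solution for small sets

    # Median of every group of 5 (the last group may be smaller).
    groups = [keys[i:i + 5] for i in range(0, len(keys), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # m*: the median of the medians, found by a recursive call.
    m_star = select(medians, (len(medians) + 1) // 2)

    # Partition around m* (assumes distinct keys).
    s1 = [x for x in keys if x < m_star]
    s2 = [x for x in keys if x > m_star]

    if k == len(s1) + 1:
        return m_star
    if k <= len(s1):
        return select(s1, k)             # recursion on the smaller keys
    return select(s2, k - len(s1) - 1)   # recursion with adjusted rank


print(select(list(range(255, 0, -1)), 128))  # 128
```

The usage line mirrors the earlier rank example: with n = 255 distinct keys, the median is the key of rank 128.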