Union-Find
Algorithm: Design & Analysis 11

In the last class
- Hashing
- Collision Handling for Hashing
- Closed Address Hashing
- Open Address Hashing
- Hash Functions

Union-Find
- Dynamic Equivalence Relation
- Implementing Dynamic Set by Union-Find
- Straight Union-Find
- Making Shorter Tree by Weighted Union
- Compressing Path by Compressing-Find
- Amortized Analysis of wUnion-cFind

Maze Creating: an Example
[Figure: a maze grid with an inlet and an outlet; cells i and j lie on either side of a wall.]
- Select a wall to pull down at random.
- If the cells i and j separated by the wall are in the same equivalence class, select another wall to pull down; otherwise, join the two classes into one.
- The maze is complete when the inlet and outlet are in one equivalence class.

Dynamic Equivalence Relations
- Equivalence: reflexive, symmetric, transitive; the equivalence classes form a partition.
- Dynamic equivalence relation: changing in the process of computation.
- IS instruction: answers yes or no (are the two elements in the same equivalence class?).
- MAKE instruction: combines two equivalence classes by relating two unrelated elements, and influences the results of subsequent IS instructions.
- Starting as the equality relation.

Union-Find ADT
- An object of type Union-Find is a collection of disjoint sets.
- There is no way to traverse all the elements in one set.

Union-Find ADT
- Constructor: UnionFind create(int n)
  sets = create(n) refers to a newly created group of sets {1}, {2}, ..., {n} (n singletons).
- Access function: int find(UnionFind sets, int e)
  find(sets, e) returns the identifier of the set that contains e.
- Manipulation procedures:
  void makeSet(UnionFind sets, int e)
  void union(UnionFind sets, int s, int t)
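
The ADT above can be sketched in Python as follows. This is only a minimal illustration, assuming an in-tree representation in which `parent[e] == e` marks a root. The `parent` list and the method names `make_set`, `find` and `union` are illustrative choices that mirror the slide's `makeSet`, `find` and `union`; no weighting or path compression is applied.

```python
class UnionFind:
    """Straight (unweighted, uncompressed) union-find over elements 1..n."""

    def __init__(self, n):
        # create(n): n singleton sets {1}, {2}, ..., {n}.
        # Index 0 is unused so that elements can be numbered from 1.
        self.parent = [0] * (n + 1)
        for e in range(1, n + 1):
            self.make_set(e)

    def make_set(self, e):
        # A root is its own parent (an assumption of this sketch).
        self.parent[e] = e

    def find(self, e):
        # Follow parent links until the root, which identifies the set.
        while self.parent[e] != e:
            e = self.parent[e]
        return e

    def union(self, s, t):
        # s and t must be roots of different sets; t becomes the parent of s.
        self.parent[s] = t


sets = UnionFind(5)
sets.union(sets.find(1), sets.find(2))    # relate 1 and 2 (a MAKE instruction)
sets.union(sets.find(2), sets.find(3))    # relate 2 and 3
same_1_3 = sets.find(1) == sets.find(3)   # an IS instruction on 1 and 3
same_1_4 = sets.find(1) == sets.find(4)   # an IS instruction on 1 and 4
```

Calling `find` on both arguments before `union` keeps the precondition that `union` only ever receives roots.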

Implementation by Union-Find
- IS $s_i \equiv s_j$: t = find(s_i); u = find(s_j); (t == u)?
- MAKE $s_i \equiv s_j$: t = find(s_i); u = find(s_j); union(t, u);

Implementation by in-tree
[Figure: an in-tree forest over nodes 0, 1, ..., i, ..., n-1, n. create(n) is a sequence of makeNode operations; find(s_j) = u follows parent links from s_j up to the root u; union(t, u) is carried out as setParent(t, u).]

Union-Find Program
- A union-find program of length m is (a create(n) operation followed by) a sequence of m union and/or find operations interspersed in any order.
- A union-find program is considered an input, the object for which the analysis is conducted.
- The measure: number of accesses to the parent
  assignments: for union operations
  lookups: for find operations

Worst-case Analysis for Union-Find Program
- Assume each lookup or assignment takes $\Theta(1)$ time.
- Each makeSet or union does one assignment, and each find does d+1 lookups, where d is the depth of the node.
- Example: Union(1,2), Union(2,3), ..., Union(n-1,n), followed by Find(1), ..., Find(1).
- Operations done: $n + (n-1) + (m-n+1)\,n$
- The sequence of Unions makes a chain of length n-1, which is the tree with the largest height.
- Each Find(1) needs n array lookups, so the worst case is in $\Theta(mn)$.

Weighted Union: for Short Trees
- Weighted union (wUnion): always make the tree with fewer nodes the subtree.
- To keep the Union valid, each Union operation is replaced by:
t = find(i); u = find(j); union(t, u)
- The order of (t, u) is chosen to satisfy the weighting requirement.
[Figure: tree made by wUnion over nodes 1, 2, 3, ..., n-1, n. Not the worst case!]
- Cost for the program: $n + 3(n-1) + 2(m-n+1)$

Upper Bound of Tree Height
- After any sequence of Union instructions, implemented by wUnion, any tree that has k nodes will have height at most $\lfloor \lg k \rfloor$.
- Proof by induction on k:
  Base case: k = 1, the height is 0.
  By the inductive hypothesis: $h_1 \le \lfloor \lg k_1 \rfloor$ and $h_2 \le \lfloor \lg k_2 \rfloor$.
  $h = \max(h_1, h_2 + 1)$ and $k = k_1 + k_2$.
  If $h = h_1$: $h \le \lfloor \lg k_1 \rfloor \le \lfloor \lg k \rfloor$.
  If $h = h_2 + 1$: note that $k_2 \le k/2$, so $h_2 + 1 \le \lfloor \lg k_2 \rfloor + 1 \le \lfloor \lg k \rfloor$.
[Figure: trees T1 (k1 nodes, height h1) and T2 (k2 nodes, height h2), with roots t and u, are joined into T (k nodes, height h).]

Upper Bound for Union-Find Program
- A Union-Find program of size m, on a set of n elements, performs $\Theta(n + m \log n)$ link operations in the worst case if wUnion and straight find are used.
- Proof: At most n-1 wUnions can be done, building a tree with height at most $\lfloor \lg n \rfloor$. Then each find costs at most $\lfloor \lg n \rfloor + 1$. Each wUnion costs $O(1)$, so,
the upper bound on the cost of any combination of m wUnion/find operations is the cost of m find operations, that is, $m(\lfloor \lg n \rfloor + 1) \in O(n + m \log n)$.
- There do exist programs requiring $\Omega(n + m \log n)$ steps.

Decreasing Complexity by Path Compression
[Figure: a path through nodes v, w, x before and after path compression.]
- cFind does twice as many link operations as the find does for a given node in a given tree.

Co-Strength of wUnion and cFind
- The number of link operations done by a Union-Find program implemented with wUnion and cFind, of length m on a set of n elements, is in $O((n+m) \lg^*(n))$ in the worst case.
- What is $\lg^*(n)$? Define the function H as follows: $H(0) = 1$, $H(i+1) = 2^{H(i)}$. Then $\lg^*(j)$ for $j \ge 1$ is defined as $\lg^*(j) = \min\{k \mid H(k) \ge j\}$.
- $\lg^*(n)$ grows extremely slowly: it is in $o(\log^{(p)} n)$, where $\log^{(p)}$ is the logarithm iterated p times.

Definitions with a Union-Find Program P
- Forest F: the forest constructed by the sequence of union instructions in P, assuming that wUnion is used and that the finds in P are ignored.
- Height of a node v in any tree: the height of the subtree rooted at v.
- Rank of v: the height of v in F.
- Note: cFind changes the height of a node.

Constraints on Ranks in F
- The number of nodes with rank r ($r \ge 0$) is at most $n/2^r$.
  Remember that a tree with k nodes built by wUnion has height at most $\lfloor \lg k \rfloor$, which means a subtree of height r has at least $2^r$ nodes.
  The subtrees with root at rank r are disjoint.
- The upper bound for rank is $\lfloor \lg n \rfloor$.
  There are altogether n elements in S, that is, n nodes in F.

Increasing Sequence of Ranks
- The ranks of the nodes on a path from a leaf to a root of a tree in F form a strictly increasing sequence.
- When a cFind operation changes the parent of a node, the new parent has higher rank than the old parent of that node.
  Note: the new parent was an ancestor of the previous parent.

A Function Growing Extremely Slowly
- Function H: $H(0) = 1$, $H(i+1) = 2^{H(i)}$; that is, $H(k)$ is a tower of k 2s.
  Note: H grows extremely fast: $H(4) = 2^{16} = 65536$, $H(5) = 2^{65536}$.
- Function log-star: $\lg^*(j)$ is defined as the least i such that $H(i) \ge j$, for $j > 0$.
- Log-star grows extremely slowly: $\lg^*(n) \in o(\log^{(p)} n)$, where p is any fixed nonnegative constant.

Grouping Nodes by Ranks
Node $v \in S_i$ ($i \ge 0$) iff $\lg^*(1 + \text{rank of } v) = i$.
Upper bound of the number of distinct node groups is $\lg^*(n+1)$. The
rank of any node in F is at most $\lfloor \lg n \rfloor$, so the largest group index is $\lg^*(1 + \lfloor \lg n \rfloor) = \lg^*(\lceil \lg(n+1) \rceil) = \lg^*(n+1) - 1$.
If $\lg^*(n+1) = k$, then $\lg(n+1) \le H(k-1)$, a tower of (k-1) 2s.

Amortized Cost of Union-Find
- Amortized equation recalled: amortized cost = actual cost + accounting cost
- The operations to be considered: n makeSets, and m union & find operations (with at most n-1 unions).

Accounting Cost for cFind
[Figure: the path followed by a cFind, from $v = w_0$ through $w_{i-1}$, $w_i$, ..., $w_{k-1}$ to the root $w_k$, with a group boundary marked on the path.]
- Only when k = 0 or k = 1 is there no parent change.
- For one cFind operation, the actual cost is 2k, not 2(k+1).
- The accounting cost is -2 for each node whose parent changes. But no negative accounting cost is assigned to a node whose parent is in a different (higher) group.

Amortizing Scheme for wUnion-cFind
- makeSet: the accounting cost is $4 \lg^*(n+1)$, so the amortized cost is $1 + 4 \lg^*(n+1)$.
- wUnion: the accounting cost is 0, so the amortized cost is 1.
- cFind: the accounting cost is as described on the previous slide. At most $\lg^*(n+1) - 1$ of the $k-1$ nodes whose parent changes have a parent in a higher group, so the amortized cost is at most $2k - 2\big((k-1) - (\lg^*(n+1) - 1)\big) = 2 \lg^*(n+1)$. (Compare with the worst-case cost of cFind, $2 \lg n$.)

Validation of the Amortizing Scheme
- We must be sure that the sum of the accounting costs is never negative.
- The sum of the negative charges, incurred by cFind, does not exceed $4n \lg^*(n+1)$.
- We prove this by showing that at most $2n \lg^*(n+1)$ withdrawals, each of -2, are charged to nodes.

Key Idea in the Derivation
- When a cFind changes the parent of a node, the new parent always has higher rank than the old parent.
- Once a node is assigned a new parent in a higher group, no more negative accounting cost will be incurred for it.
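- The number of different ranks is limited within a group.

Derivation
- The number of withdrawals for all $w \in S$ is at most $2n \lg^*(n+1)$. A node in group $i \ge 1$ can be charged at most once per rank in its group, and the group spans fewer than $H(i)$ ranks. Group i holds fewer than $\sum_{r \ge H(i-1)} n/2^r = 2n/H(i)$ nodes. So each of the at most $\lg^*(n+1)$ groups accounts for fewer than $2n$ withdrawals.

The Conclusion
- The number of link operations done by a Union-Find program implemented with wUnion and cFind, of length m on a set of n elements, is in $O((n+m) \lg^*(n))$ in the worst case.
- Note: since the sum of the accounting costs is never negative, the total actual cost never exceeds the total amortized cost. And the upper bound of the amortized cost is $(n+m)(1 + 4 \lg^*(n+1))$.

Home Assignments
6.19, 6.21, 6.23, 6.25-27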
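
The combination of wUnion and cFind analyzed above can be sketched in Python as follows. This is a hedged illustration rather than the lecture's own code: the class name `WeightedUnionFind`, the `size` list used for weighting, and the helper `make_equivalent` are assumptions of this sketch, and `lg_star` simply evaluates the definition of $\lg^*$ through H.

```python
class WeightedUnionFind:
    """Union-find with weighted union (wUnion) and compressing find (cFind)."""

    def __init__(self, n):
        # parent[e] == e marks a root; index 0 is unused (elements are 1..n).
        self.parent = list(range(n + 1))
        # size[r] is the node count of the tree rooted at r (valid for roots).
        self.size = [1] * (n + 1)

    def c_find(self, e):
        # First pass: locate the root.
        root = e
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[e] != root:
            nxt = self.parent[e]
            self.parent[e] = root
            e = nxt
        return root

    def w_union(self, s, t):
        # s and t must be distinct roots; the tree with fewer nodes
        # becomes a subtree of the other root.
        if self.size[s] > self.size[t]:
            s, t = t, s
        self.parent[s] = t
        self.size[t] += self.size[s]

    def make_equivalent(self, i, j):
        # Hypothetical helper: t = find(i); u = find(j); union(t, u).
        t, u = self.c_find(i), self.c_find(j)
        if t != u:
            self.w_union(t, u)


def lg_star(j):
    """Least i such that H(i) >= j, where H(0) = 1 and H(i + 1) = 2 ** H(i)."""
    i, h = 0, 1
    while h < j:
        i, h = i + 1, 2 ** h
    return i


uf = WeightedUnionFind(8)
for i, j in [(1, 2), (3, 4), (1, 3), (5, 6), (7, 8), (5, 7), (1, 5)]:
    uf.make_equivalent(i, j)
root_of_1 = uf.c_find(1)
```

After `c_find(1)` returns, node 1 points directly at its root, which is the effect path compression is meant to have on later finds.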