MIT CLRS Notes
Compiled by 苑炜弢

Lecture 1

Analysis of algorithms: the theoretical study of computer program performance. It is about making things run fast, and speed is fun.

Problem: sorting
Input: a sequence of numbers.
Output: a permutation of the input in sorted (nondecreasing) order.

Insertion sort (A, n): described in pseudocode.

Running time:
- Depends on the input (e.g., whether it is already sorted).
- Depends on the input size (6 elements vs. 6 × 10000000), so we parameterize the running time by the input size.
- We want an upper bound, which is a guarantee to the user.

Kinds of analysis:
- Worst case (usually): $T(n)$ = maximum time on any input of size $n$. With the max, $T(n)$ is a function; without it, $T(n)$ would be a relation, not a function.
- Average case (sometimes): $T(n)$ = expected time over all inputs of size $n$. This needs an assumption about the probability distribution of the inputs.
- Best case (bogus): one can cheat with a slow algorithm that happens to work fast on some input.

What is the running time of insertion sort? It depends on the computer:
- relative speed (run two programs on the same machine);
- absolute speed (on different machines).

Insertion sort analysis, worst case (input reverse sorted): $T(n) = \sum_{j=2}^{n} \Theta(j)$.

Big idea: asymptotic analysis (huge idea!)
- Ignore machine-dependent constants.
- Look at the growth of $T(n)$ as $n \to \infty$.

The major tool of asymptotic analysis is $\Theta$-notation: drop low-order terms and ignore leading constants. For example, a cubic polynomial in $n$ is $\Theta(n^3)$ whatever its coefficients and lower-order terms.

The constants do not affect the order of growth. If you compare orders of growth, beyond some input size the algorithm with the slower-growing running time beats the one with the faster-growing running time. The constants matter for the crossover point between two algorithms with different growth rates, and when two algorithms have the same growth rate. For practical input sizes (300, say), the constants should be taken into consideration.
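The insertion sort that this lecture uses as its running example can be sketched in Python as follows. This is a minimal sketch, not the lecture's pseudocode: the name `insertion_sort`, the 0-based indexing, and the shift counter are assumptions added for illustration.

```python
def insertion_sort(a):
    """Sort list `a` in place and return the number of element shifts."""
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Move elements of the sorted prefix a[0..j-1] that exceed `key`
        # one position to the right to open a slot for `key`.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts
```

On an already sorted list the inner loop never runs, so the shift count is 0. On a reverse-sorted list of 6 elements it is 1 + 2 + 3 + 4 + 5 = 15, the arithmetic series behind the quadratic worst case.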
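Insertion sort analysis:
- Worst case: $T(n) = \sum_{j=2}^{n} \Theta(j) = \Theta(n^2)$ (arithmetic series). This is an upper bound on the running time for every input.
- Best case: the running time is $\Theta(n)$.
- Average case: $T(n) = \sum_{j=2}^{n} \Theta(j/2) = \Theta(n^2)$.
The growth rates in these situations differ: the running time of the algorithm varies with the input (it is not a function of $n$ alone), but the worst-case running time is a function of $n$.
Insertion sort is a good sorting algorithm for small $n$, but not for large $n$.

Merge sort $A[1 .. n]$
1. If $n = 1$, done.
2. Recursively sort $A[1 .. \lceil n/2 \rceil]$ and $A[\lceil n/2 \rceil + 1 .. n]$.
3. "Merge" the 2 sorted lists.

$T(n) = \Theta(1) + 2T(n/2) + \Theta(n)$, where the $\Theta(1)$ term is the constant-time check of step 1 (writing the recurrence this way is a slight abuse of notation). Hence $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \lg n)$, which is better than insertion sort for large $n$.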
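The three merge sort steps can be sketched in Python as below. This is an illustrative sketch under assumptions: the names `merge` and `merge_sort` are not from the notes, lists are 0-based, and the function returns a new list instead of sorting in place.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from `left` on ties keeps equal elements in input order.
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])   # at most one of these two tails is non-empty
    out.extend(right[j:])
    return out


def merge_sort(a):
    """Return a sorted copy of `a`."""
    if len(a) <= 1:                      # step 1: nothing to do
        return list(a)
    mid = (len(a) + 1) // 2              # ceil(n/2), as in step 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))  # steps 2 and 3
```

Each call does linear work in `merge` and makes two recursive calls on halves, which is where the recurrence with two half-size subproblems and a linear term comes from.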
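Lecture 2: Solving recurrences

Substitution method (the most general method; it can be applied to anything)
Steps:
1) Guess the form of the solution.
2) Verify the guess by induction.
3) Solve for the constants that make the solution work.

Example: $T(n) = 4T(n/2) + n$, with $T(1) = \Theta(1)$.
(1) Guess $T(n) = O(n^3)$ and prove it by induction. Assume $T(k) \le ck^3$ for $k < n$. Then
$$T(n) = 4T(n/2) + n \le 4c(n/2)^3 + n = \tfrac{c}{2}n^3 + n = cn^3 - \left(\tfrac{c}{2}n^3 - n\right) \le cn^3$$
whenever the residual $\tfrac{c}{2}n^3 - n \ge 0$.
(2) We can therefore choose $c$ so that the residual is $\ge 0$ (e.g. $c = 2$, $n_0 = 1$).
(3) So far the base case has been ignored, so the induction must be grounded with a base case. Initial condition: $T(n) = \Theta(1)$ for $n < n_0$, so $T(n) \le cn^3$ holds there if $c$ is chosen large enough.

The bound $O(n^3)$ is not tight. To prove $T(n) = O(n^2)$, the hypothesis $T(k) \le ck^2$ is not enough, because it only gives $T(n) \le cn^2 + n$. Strengthening it to $T(k) \le c_1k^2 - c_2k$ gives
$$T(n) \le 4\left(c_1(n/2)^2 - c_2(n/2)\right) + n = c_1n^2 - c_2n - (c_2n - n) \le c_1n^2 - c_2n \quad \text{if } c_2 \ge 1.$$
So choose $c_2 = 1$; the base case then fixes the value of $c_1$, that is, $c_1$ is constrained by the initial conditions.
The lesson is that the constant of the leading term is determined by the initial conditions, so making the base cases ($n = 1, 2, 3, \dots$) cheaper improves the algorithm.
[Everything above proves a big-O bound. To prove $\Theta$, both the $O$ bound and the $\Omega$ bound must be proved.]

Recursion trees
The substitution method is a simple way to carry out a proof, but it needs an accurate guess, which is sometimes hard to find. Drawing a recursion tree is a way to obtain a good guess.
- A recursion tree models the cost (time) of the recursive execution of an algorithm.
- Each node represents the cost of one subproblem in the set of recursive calls. Summing the costs within each level gives a set of per-level costs, and summing the per-level costs gives the total cost of the recurrence.

Example: $T(n) = T(n/4) + T(n/2) + n^2$. Expanding the recurrence level by level:
- Level 0: $n^2$, total $n^2$.
- Level 1: $(n/4)^2$ and $(n/2)^2$, total $n^2/16 + n^2/4 = 5n^2/16$.
- Level 2: $(n/16)^2$, $(n/8)^2$, $(n/8)^2$, $(n/4)^2$, total $25n^2/256$.
- Leaves: $\Theta(1)$ each.
Summing over the levels: $T(n) = n^2\left(1 + \frac{5}{16} + \left(\frac{5}{16}\right)^2 + \left(\frac{5}{16}\right)^3 + \cdots\right) = \Theta(n^2)$ (geometric series).

Master method
The master method applies to recurrences of the form $T(n) = aT(n/b) + f(n)$, where $a \ge 1$, $b > 1$, and $f$ is asymptotically positive (there exists $n_0$ such that $f(n) > 0$ for $n \ge n_0$). It is an application of the recursion tree, but more precise.

1. The three cases of the master theorem (each compares $f(n)$ with $n^{\log_b a}$):
Case 1: $f(n) = O(n^{\log_b a - \varepsilon})$ for some constant $\varepsilon > 0$. Then $T(n) = \Theta(n^{\log_b a})$.
Interpretation: $f(n)$ is not merely smaller than $n^{\log_b a}$, it is polynomially smaller.
Case 2: $f(n) = \Theta(n^{\log_b a} \lg^k n)$ for some constant $k \ge 0$. Then $T(n) = \Theta(n^{\log_b a} \lg^{k+1} n)$.
Case 3: $f(n) = \Omega(n^{\log_b a + \varepsilon})$ for some constant $\varepsilon > 0$, and $af(n/b) \le cf(n)$ for some constant $c < 1$. Then $T(n) = \Theta(f(n))$.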
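For driving functions of the special form $f(n) = \Theta(n^d \lg^k n)$, choosing among the three cases is mechanical, and a small Python sketch can make the comparison with $n^{\log_b a}$ concrete. The function name `master_theorem`, its parameters `d` and `k`, and the string output format are assumptions made for this illustration; the sketch does not cover functions $f$ outside this form.

```python
import math


def master_theorem(a, b, d, k=0):
    """Classify T(n) = a*T(n/b) + Theta(n^d * lg^k n).

    Returns a string describing Theta(T(n)), or None when none of the
    three cases applies.
    """
    if a < 1 or b <= 1:
        raise ValueError("need a >= 1 and b > 1")
    crit = math.log(a, b)                # the critical exponent log_b(a)
    if math.isclose(d, crit):
        if k >= 0:                       # Case 2: same polynomial order
            return f"n^{crit:g} lg^{k + 1} n"
        return None                      # e.g. f(n) = n^2 / lg n
    if d < crit:                         # Case 1: f polynomially smaller
        return f"n^{crit:g}"
    # Case 3: f polynomially larger. For f of this form the regularity
    # condition a*f(n/b) <= c*f(n) holds for large n with some c < 1.
    return f"n^{d:g} lg^{k:g} n" if k else f"n^{d:g}"
```

For instance, `master_theorem(4, 2, 1)` falls under Case 1 and `master_theorem(4, 2, 2, -1)` returns `None`, because $n^2/\lg n$ is smaller than $n^2$ but not polynomially smaller.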
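Interpretation of Case 3: $f(n)$ must be not merely larger than $n^{\log_b a}$ but polynomially larger, and it must also satisfy the "regularity" condition $af(n/b) \le cf(n)$. The regularity condition ensures that the cost contributed by $f$ keeps shrinking from one level of the recursion to the next.

2. Examples:
1) $T(n) = 4T(n/2) + n$: $a = 4$, $b = 2$, so $n^{\log_b a} = n^2$; $f(n) = n$.
Case 1: $f(n) = O(n^{2 - \varepsilon})$ for $\varepsilon = 1$, so $T(n) = \Theta(n^2)$.
2) $T(n) = 4T(n/2) + n^2$: $a = 4$, $b = 2$, so $n^{\log_b a} = n^2$; $f(n) = n^2$.
Case 2: $f(n) = \Theta(n^2 \lg^0 n)$, i.e. $k = 0$, so $T(n) = \Theta(n^2 \lg n)$.
3) $T(n) = 4T(n/2) + n^3$: $a = 4$, $b = 2$, so $n^{\log_b a} = n^2$; $f(n) = n^3$.
Case 3: $f(n) = \Omega(n^{2 + \varepsilon})$ for $\varepsilon = 1$, and $4(n/2)^3 \le cn^3$ (regularity condition) for $c = 1/2$, so $T(n) = \Theta(n^3)$.
4) $T(n) = 4T(n/2) + n^2/\lg n$: $a = 4$, $b = 2$, so $n^{\log_b a} = n^2$; $f(n) = n^2/\lg n$.
The master method does not apply. In particular, for every constant $\varepsilon > 0$ we have $n^{\varepsilon} = \omega(\lg n)$.

3. Proof sketch of the master method:
Case 1: the weight increases geometrically from the root to the leaves. The leaves hold a constant fraction of the total weight: $T(n) = \Theta(n^{\log_b a})$.
Case 2 ($k = 0$): the weight is approximately the same on each of the $\log_b n$ levels: $T(n) = \Theta(n^{\log_b a} \lg n)$.
Case 3: the weight decreases geometrically from the root to the leaves. The root holds a constant fraction of the total weight: $T(n) = \Theta(f(n))$.

Lecture 3: Divide and Conquer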
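Divide-and-conquer paradigm
1. Divide the problem (instance) into subproblems.
2. Conquer the subproblems by solving them recursively.
3. Combine the subproblem solutions.

Merge sort
1. Divide: trivial.
2. Conquer: recursively sort the 2 subarrays $A[1 .. n/2]$ and $A[n/2 + 1 .. n]$.
3. Combine: linear-time merge of the 2 sorted sequences.
$T(n) = 2T(n/2) + \Theta(n)$, where 2 is the number of subproblems, $n/2$ is the size of each subproblem, and $\Theta(n)$ is the cost of dividing and combining. Hence $T(n) = \Theta(n \lg n)$.

Binary search on a sorted array
1. Divide: compare with the middle element.
2. Conquer: search in 1 subarray.
3. Combine: trivial.
Example: find 9 in 3 5 7 8 9 12 15. The first comparison is with 8, the second with 12, the third with 9.
$T(n) = T(n/2) + \Theta(1)$, so $T(n) = \Theta(\lg n)$.

Powering a number
Problem: compute $a^n$, where $n \in \mathbb{N}$.
Naive algorithm: $a^n = a \cdot a \cdots a$, $T(n) = \Theta(n)$.
Divide-and-conquer algorithm (recursive squaring):
$$a^n = \begin{cases} a^{n/2} \cdot a^{n/2} & \text{if } n \text{ is even} \\ a^{(n-1)/2} \cdot a^{(n-1)/2} \cdot a & \text{if } n \text{ is odd} \end{cases}$$
$T(n) = T(n/2) + \Theta(1)$, so $T(n) = \Theta(\lg n)$.

Fibonacci numbers
$$F_n = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ F_{n-1} + F_{n-2} & \text{if } n \ge 2 \end{cases}$$
Algorithms:
- Naive recursive: exponential time, $\Omega(\varphi^n)$, where $\varphi = (1 + \sqrt{5})/2$.
- Known closed form: $F_n = \varphi^n/\sqrt{5}$ rounded to the nearest integer. Computing $\varphi^n$ by recursive squaring is fast, but floating-point round-off makes the result wrong.
- Bottom-up algorithm (reuse previously computed numbers): $T(n) = \Theta(n)$.
- Faster algorithm for Fibonacci numbers. Theorem:
$$\begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n$$
It can be proved by induction on $n$. Recursive squaring of the matrix gives $T(n) = \Theta(\lg n)$.

Matrix multiplication
$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$
Standard algorithm:
for i ← 1 to n
    do for j ← 1 to n
        do c_ij ← 0
           for k ← 1 to n
               do c_ij ← c_ij + a_ik · b_kj
$T(n) = \Theta(n^3)$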
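Recursive squaring and the Fibonacci matrix theorem combine into a logarithmic-time Fibonacci routine that uses exact integer arithmetic and therefore avoids the round-off problem of the closed form. The sketch below is illustrative only: the names `mat_mul`, `mat_pow`, and `fib` and the tuple-of-tuples matrix representation are assumptions, not part of the notes.

```python
def mat_mul(x, y):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def mat_pow(m, n):
    """Compute m**n for n >= 0 with Theta(lg n) matrix multiplications."""
    if n == 0:
        return ((1, 0), (0, 1))          # 2x2 identity
    half = mat_pow(m, n // 2)            # one half-size subproblem
    square = mat_mul(half, half)
    return mat_mul(square, m) if n % 2 else square


def fib(n):
    """Read F_n off the matrix [[1, 1], [1, 0]]**n."""
    return mat_pow(((1, 1), (1, 0)), n)[0][1]
```

Only one recursive call is made per level, on an exponent of half the size, which matches the recurrence with a single half-size subproblem and constant extra work.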
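We will use a divide-and-conquer algorithm.
Idea: carve each $n \times n$ matrix into a $2 \times 2$ matrix of $(n/2) \times (n/2)$ submatrices:
$$\begin{pmatrix} r & s \\ t & u \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix}$$
This needs 8 recursive multiplications plus $\Theta(n^2)$ work for the additions: $T(n) = 8T(n/2) + \Theta(n^2)$, so $T(n) = \Theta(n^3)$, which is no better than the standard algorithm.

Strassen's algorithm
Idea: multiply $2 \times 2$ matrices with only 7 recursive multiplications.
$P_1 = a \cdot (f - h)$
$P_2 = (a + b) \cdot h$
$P_3 = (c + d) \cdot e$
$P_4 = d \cdot (g - e)$
$P_5 = (a + d) \cdot (e + h)$
$P_6 = (b - d) \cdot (g + h)$
$P_7 = (a - c) \cdot (e + f)$
$r = P_5 + P_4 - P_2 + P_6$
$s = P_1 + P_2$
$t = P_3 + P_4$
$u = P_5 + P_1 - P_3 - P_7$
7 multiplications, 18 additions/subtractions. Note: no reliance on commutativity of multiplication!
1. Divide: partition A and B into $(n/2) \times (n/2)$ submatrices. Form the terms to be multiplied using $+$ and $-$.
2. Conquer: perform 7 multiplications of $(n/2) \times (n/2)$ submatrices recursively.
3. Combine: form C using $+$ and $-$ on $(n/2) \times (n/2)$ submatrices.
Analysis of Strassen: $T(n) = 7T(n/2) + \Theta(n^2)$, so $T(n) = \Theta(n^{\lg 7})$ with $\lg 7 \approx 2.81$.
The number 2.81 may not seem much smaller than 3, but because the difference is in the exponent, the impact on running time is significant. In fact, Strassen's algorithm beats the ordinary algorithm on today's machines for $n \ge 32$ or so.
Best to date (of theoretical interest only): $\Theta(n^{2.376})$.

Conclusion
- Divide and conquer is just one of several powerful techniques for algorithm design.
- Divide-and-conquer algorithms can be analyzed using recurrences and the master method (so practice this math).
- The divide-and-conquer strategy often leads to efficient algorithms.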
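Strassen's seven products and four combinations translate almost line by line into code. The following Python sketch assumes square matrices stored as lists of lists whose size is a power of 2; the helper names `add`, `sub`, `split`, and `strassen` are illustrative assumptions, and no cutoff to the standard algorithm for small sizes is included.

```python
def add(x, y):
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(x, y)]


def sub(x, y):
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(x, y)]


def split(m):
    """Return the four quadrants of an even-sized square matrix."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    a, b, c, d = split(A)
    e, f, g, h = split(B)
    p1 = strassen(a, sub(f, h))
    p2 = strassen(add(a, b), h)
    p3 = strassen(add(c, d), e)
    p4 = strassen(d, sub(g, e))
    p5 = strassen(add(a, d), add(e, h))
    p6 = strassen(sub(b, d), add(g, h))
    p7 = strassen(sub(a, c), add(e, f))
    r = add(sub(add(p5, p4), p2), p6)
    s = add(p1, p2)
    t = add(p3, p4)
    u = sub(sub(add(p5, p1), p3), p7)
    # Stitch the quadrants back together: [r s] on top, [t u] below.
    return [x + y for x, y in zip(r, s)] + [x + y for x, y in zip(t, u)]
```

The left factor of every product comes from A and the right factor from B, so the sketch never relies on multiplication being commutative, which is what allows the same formulas to be applied to submatrices.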
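Lecture 4

Quicksort (Hoare, 1962)
- Divide-and-conquer algorithm.
- Sorts "in place".
- Very practical (with tuning).

Divide and conquer
1. Divide: partition the array into 2 subarrays around a "pivot" $x$ such that elements in the lower subarray $\le x \le$ elements in the upper subarray.
2. Conquer: recursively sort the 2 subarrays.
3. Combine: trivial.
Key: a linear-time partitioning subroutine, Partition(A, p, q), which partitions $A[p .. q]$.

Analysis (assume all elements are distinct)
Let $T(n)$ be the worst-case running time. The worst case occurs when the input is sorted or reverse sorted, so that one side of each partition has no elements:
$T(n) = T(0) + T(n-1) + \Theta(n)$. $T(0)$ is a constant and can be dropped, so $T(n) = T(n-1) + \Theta(n) = \Theta(n^2)$ (arithmetic series).

Best-case analysis (intuition only)
If we are really lucky, Partition splits the array evenly, $n/2 : n/2$: $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \lg n)$.
Suppose the split is always $9/10 : 1/10$: $T(n) = T(n/10) + T(9n/10) + \Theta(n)$.
Recursion tree: ...

Expected running time, by substitution
The bound to prove is $E[T(n)] \le an \lg n$ for a constant $a > 0$.
Base case: choose $a$ large enough that the bound holds for small $n$.
Use the fact $\sum_{k=2}^{n-1} k \lg k \le \frac{1}{2} n^2 \lg n - \frac{1}{8} n^2$.
Substitution:
$$E[T(n)] \le \frac{2}{n} \sum_{k=2}^{n-1} ak \lg k + \Theta(n) \le \frac{2a}{n} \left( \frac{1}{2} n^2 \lg n - \frac{1}{8} n^2 \right) + \Theta(n) = an \lg n - \left( \frac{an}{4} - \Theta(n) \right) \le an \lg n$$
if $a$ is big enough that $an/4$ dominates the $\Theta(n)$ term.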
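A linear-time partitioning subroutine and the quicksort built on it might look like the Python sketch below. The body of Partition is not given in these notes, so the choice of the first element $A[p]$ as pivot, the inclusive index range, and the names `partition` and `quicksort` are assumptions made for illustration.

```python
def partition(a, p, q):
    """Partition a[p..q] (inclusive) around the pivot x = a[p].

    Afterwards a[p..i-1] <= x = a[i] <= a[i+1..q]; returns i.
    """
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1                       # grow the "<= x" region by one
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]              # put the pivot between the regions
    return i


def quicksort(a, p=0, q=None):
    """Sort a[p..q] in place."""
    if q is None:
        q = len(a) - 1
    if p < q:
        r = partition(a, p, q)
        quicksort(a, p, r - 1)
        quicksort(a, r + 1, q)
```

With this pivot rule an already sorted input makes every partition maximally unbalanced, which is the quadratic worst case analyzed above. Picking the pivot at random is the usual way to make that behavior unlikely for any fixed input.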
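Lecture 5

How fast can we sort?
Comparison sorts use only comparisons to determine the relative order of elements, e.g. insertion sort ($\Theta(n^2)$), merge sort ($\Theta(n \lg n)$), quicksort ($\Theta(n^2)$ in the worst case, $\Theta(n \lg n)$ expected), heapsort ($\Theta(n \lg n)$).

Decision trees (sorting example: p. 166)
Each internal node is labeled $i : j$.
- The left subtree gives the subsequent comparisons if $a_i \le a_j$.
- The right subtree gives the subsequent comparisons if $a_i > a_j$.
Each leaf holds a permutation $\langle \pi(1), \pi(2), \dots, \pi(n) \rangle$, indicating that $a_{\pi(1)} \le a_{\pi(2)} \le \dots \le a_{\pi(n)}$.

A decision tree can model any comparison sort:
- one tree for each $n$;
- view the algorithm as splitting (branching) whenever it makes a comparison;
- the tree represents all possible execution traces;
- running time = length of the root-to-leaf path taken;
- worst-case running time = height of the tree.

Theorem: any decision tree that sorts $n$ numbers has height $\Omega(n \lg n)$.
Proof: correctness requires #leaves $\ge$ #permutations of $n$ numbers $= n!$. (We use $\ge$ rather than $=$ because a real algorithm may make the same comparison twice and therefore have more leaves than there are permutations.) A binary tree of height $h$ has at most $2^h$ leaves, so $2^h \ge n!$ and
$h \ge \lg(n!)$ (because $\lg$ is monotonically increasing)
$\ge \lg((n/e)^n)$ (by Stirling's formula)
$= n \lg n - n \lg e$
$= \Omega(n \lg n)$.
Corollary: heapsort and merge sort are asymptotically optimal comparison sorts.

Counting sort (no comparisons between elements)
Input: $A[1 .. n]$, where each $A[i] \in \{1, 2, \dots, k\}$.
Output: $B[1 .. n]$, sorted.
Auxiliary storage: $C[1 .. k]$.
Source code: p. 168. Running time: $\Theta(n + k)$.
Important property of counting sort: stability, i.e. it preserves the input order of equal elements.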
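A stable counting sort can be sketched in Python as follows. This is not the textbook listing referred to above: the name `counting_sort`, the optional `key` parameter (added so that stability is observable on records), and the 0-based output list are assumptions made for this illustration.

```python
def counting_sort(a, k, key=lambda x: x):
    """Stable sort of `a`, whose keys key(x) are integers in 1..k.

    Runs in Theta(n + k) time and returns a new list.
    """
    count = [0] * (k + 1)                # count[i] = number of keys equal to i
    for x in a:
        count[key(x)] += 1
    for i in range(1, k + 1):            # now count[i] = number of keys <= i
        count[i] += count[i - 1]
    out = [None] * len(a)
    for x in reversed(a):                # right-to-left scan preserves the
        count[key(x)] -= 1               # input order of equal keys
        out[count[key(x)]] = x
    return out
```

Sorting the records `(3, 'a'), (1, 'b'), (3, 'c'), (1, 'd')` by their first component keeps `'b'` before `'d'` and `'a'` before `'c'`, which is the stability property that radix sort relies on.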
Radix sort
Digit-by-digit sort, originating with card-sorting machines.
Original method: sort on the most significant digit first.
Radix sort: sort on the least significant digit first, using an auxiliary stable sort (typically counting sort). Example: p. 171.

Correctness: induction on the digit position $t$.
Assume the numbers are sorted by their $t - 1$ least significant digits, then sort on digit $t$:
- two elements that differ in digit $t$ are put in the right order;
- two elements that agree in digit $t$ are left in their original order (by stability), so they remain sorted by their $t$ least significant digits.

Analysis (with counting sort as the auxiliary sort)
Sort $n$ words of $b$ bits each. Split each word into equal pieces of $r$ bits; this takes $b/r$ passes of counting sort with $k = 2^r$.
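Least-significant-digit radix sort on $b$-bit keys split into $r$-bit digits can be sketched in Python as below, with a stable counting pass inlined for each digit. The name `radix_sort` and its parameters `b` and `r` are assumptions for illustration, and the sketch handles only non-negative integers smaller than $2^b$.

```python
def radix_sort(a, b, r):
    """Sort non-negative integers below 2**b using r-bit digits.

    Makes ceil(b / r) stable counting passes, least significant digit first.
    """
    mask = (1 << r) - 1
    for shift in range(0, b, r):
        count = [0] * (1 << r)           # counting sort with k = 2**r values
        for x in a:
            count[(x >> shift) & mask] += 1
        total = 0
        for d in range(1 << r):          # count[d] = first output slot for d
            count[d], total = total, total + count[d]
        out = [0] * len(a)
        for x in a:                      # left-to-right scan keeps each
            d = (x >> shift) & mask      # pass stable
            out[count[d]] = x
            count[d] += 1
        a = out
    return a
```

Each pass costs time proportional to $n + 2^r$, so a larger $r$ means fewer passes but a larger counting array; that trade-off is what the choice of $r$ in the analysis balances.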
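Running time: $T(n, b) = \Theta\left(\frac{b}{r}(n + 2^r)\right)$. Choosing $r = \lg n$ gives $T(n, b) = \Theta(bn/\lg n)$.
Example: for numbers in the range $0, 1, \dots, n^d - 1$ we have $b = d \lg n$, so radix sort makes $d$ passes and $T(n) = \Theta(dn)$. Radix sort is therefore usable even when the numbers are big.

Lecture 6

Order statistics
Select the $i$th smallest of $n$ numbers (the element with rank $i$).
- $i = 1$: minimum
- $i = n$: maximum
- $i = \lfloor (n+1)/2 \rfloor$ or $\lceil (n+1)/2 \rceil$: median
Naive algorithm: sort and return the $i$th element. Time: $\Theta(n \lg n) + \Theta(1) = \Theta(n \lg n)$ in the worst case.

Randomized divide-and-conquer algorithm
Source code: p. 186 (we use $q$ for the end of the array instead of $r$).
After partitioning $A[p .. q]$ around the pivot at index $r$, the elements of $A[p .. r-1]$ are $\le A[r]$ and the elements of $A[r+1 .. q]$ are $\ge A[r]$; $k = r - p + 1$ is the rank of the pivot.
Example: select the 7th smallest of 6 10 13 5 8 3 2 11. With pivot 6, partitioning gives 2 5 3 6 8 13 10 11, so $k = 4$. Since $7 > 4$, recursively select the $7 - 4 = 3$rd smallest element of the upper part.

Intuition for the analysis (all analyses today assume that the elements are distinct)
Lucky: $T(n) = T(9n/10) + \Theta(n)$, so $T(n) = \Theta(n)$.
Unlucky: $T(n) = T(n-1) + \Theta(n)$, so $T(n) = \Theta(n^2)$ (arithmetic series).

Analysis of expected time
Let $T(n)$ be the random variable for the running time on an input of $n$ numbers, assuming the random numbers are independent.
For $k = 0, 1, \dots, n-1$, define the indicator random variable $X_k = 1$ if Partition generates a $k : (n-k-1)$ split, and $X_k = 0$ otherwise.
To obtain an upper bound, assume the algorithm always recurses on the larger of the 2 subproblems:
$$T(n) = \begin{cases} T(\max(0, n-1)) + \Theta(n) & \text{if } X_0 = 1 \\ T(\max(1, n-2)) + \Theta(n) & \text{if } X_1 = 1 \\ \quad \vdots \\ T(\max(n-1, 0)) + \Theta(n) & \text{if } X_{n-1} = 1 \end{cases} \;=\; \sum_{k=0}^{n-1} X_k \left( T(\max(k, n-k-1)) + \Theta(n) \right)$$
Taking expectations, by linearity of expectation, the independence of $X_k$ from the other random choices, and $E[X_k] = 1/n$:
$$E[T(n)] = \sum_{k=0}^{n-1} E[X_k] \cdot E\left[ T(\max(k, n-k-1)) + \Theta(n) \right] = \frac{1}{n} \sum_{k=0}^{n-1} E[T(\max(k, n-k-1))] + \frac{1}{n} \sum_{k=0}^{n-1} \Theta(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)$$
Claim: $E[T(n)] \le cn$ for some constant $c > 0$.
Fact: $\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2$.
Proof: assume by induction that $E[T(k)] \le ck$ for $k < n$. Then
$$E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n) \le \frac{2c}{n} \cdot \frac{3}{8} n^2 + \Theta(n) = cn - \left( \frac{cn}{4} - \Theta(n) \right) \le cn$$
if $c$ is large enough.

Worst-case linear-time selection
With the pivot $x$ chosen as the median of the medians of groups of 5 elements, at least $3 \lfloor n/10 \rfloor$ elements are $\le x$ and, similarly, at least $3 \lfloor n/10 \rfloor$ elements are $\ge x$.
Small simplification: if $n \ge 50$, then $3 \lfloor n/10 \rfloor \ge n/4$.
$T(n) = \Theta(1)$ for $n < 50$; otherwise $T(n) \le T(n/5) + T(3n/4) + \Theta(n)$.
Claim: $T(n) \le cn$.
$$T(n) \le \frac{cn}{5} + \frac{3cn}{4} + \Theta(n) = cn - \left( \frac{cn}{20} - \Theta(n) \right) \le cn$$
if $c$ is sufficiently large.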
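Randomized selection can be sketched in Python as below. This is an illustrative sketch rather than the textbook listing: the name `rand_select`, the iterative (loop) formulation in place of recursion, the copy of the input, and the quicksort-style partition around a randomly chosen pivot are all assumptions.

```python
import random


def rand_select(a, i):
    """Return the i-th smallest element of `a` (i = 1 is the minimum)."""
    if not 1 <= i <= len(a):
        raise ValueError("rank out of range")
    a = list(a)                          # work on a copy of the input
    p, q = 0, len(a) - 1
    while p < q:
        # Move a random pivot to the front, then partition a[p..q] around it.
        r = random.randint(p, q)         # randint includes both endpoints
        a[p], a[r] = a[r], a[p]
        x, j = a[p], p
        for t in range(p + 1, q + 1):
            if a[t] <= x:
                j += 1
                a[j], a[t] = a[t], a[j]
        a[p], a[j] = a[j], a[p]
        k = j - p + 1                    # rank of the pivot within a[p..q]
        if i == k:
            return a[j]
        if i < k:
            q = j - 1                    # continue in the lower part
        else:
            p, i = j + 1, i - k          # continue in the upper part
    return a[p]
```

Only one side of each partition is explored further, which is why the expected running time is linear, whereas quicksort, which explores both sides, needs $\Theta(n \lg n)$ in expectation.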