A fast RTL Power Estimator for Combinational Circuit¹

ZHAO Wen-qing (赵文庆), CUI Ming-dong (崔铭栋) and TANG Pu-shan (唐璞山)
(CAD Lab, Electronic Engineering Department, Fudan University, 200433, Shanghai)

Abstract: With the development of modern synthesis tools, VLSI design is moving toward ever higher levels of abstraction. However, because of their computational complexity, gate-level power estimators are becoming less and less applicable to high-level modules. To estimate circuit power at an early design stage, RTL power analysis tools are needed. In this paper, we present a fast method for calculating the power of RTL combinational modules. Once the power library is built, the power dissipation of a module driven by any input vector sequence can be computed very quickly. Our method uses a first-order Taylor expansion to establish an equation-based model, and Monte-Carlo simulation is used to build the library. Results on the ISCAS85 benchmark circuits show that the relative error of our method is within 5%.

Key words: RTL, power analysis

1. Introduction

Area, speed and reliability have traditionally received the most attention in the IC design process. However, the much larger scale and much higher speed of
modern electronic systems have led to a series of power-related problems that are receiving growing attention. General-purpose macros developed independently by third-party intellectual property (IP) providers are often reused throughout a design. In a power-constrained design (such as a consumer electronic device), the power dissipation of high-level modules must be predicted at an early design phase. Tools that allow designers to evaluate the power budget during the various design phases are therefore in great demand.

Research on gate-level power estimators has been under way for a long time, and many techniques have been proposed ([1] gives a survey). Practical tools are already in use. These estimators give a precise value for the power dissipation of a circuit driven by given input vectors. However, because they rely on detailed simulation, gate-level estimators are the slowest, and a more
critical disadvantage is that the circuit netlist must be known before any simulation can be performed, which greatly hinders the advance of high-level design technology. Designers are no longer satisfied with such estimators; an RTL estimator that helps them correctly evaluate power in the RTL design phase is needed in order to choose a proper framework structure.

¹ This research is supported by the National High Technology Research and Development 863 Plan (863-SOC-Y-2-6-1, 863-SOC-Y-3-3), NSFC overseas young scientist joint research project 69928402, NSFC project 69806004, the doctoral program foundation of the Ministry of Education of China (2000024628), and the foundation for university key teachers of the Ministry of Education.

In response to this need, research on RTL power estimation started in the 1990s, and a number of high-level estimation techniques have been proposed since 1995. Generally speaking, these techniques fall into two classes. The first class tries to establish a tabular or equation model
based on the analysis of a set of circuits. These methods take as input the characteristics of the stimulus vectors, which are the decisive factor in module power dissipation. For a given vector sequence, its signal probability and correlation are calculated first; then a power table is looked up, or a power equation is evaluated, using these parameters to produce the power dissipation. The second class goes much deeper into a module: the signal transition status of the inner nodes is analyzed, and transformation methods are then applied to give the final power estimate. The common procedure for both of the
two kinds of methods is: 1) behavior-level simulation to obtain vector information as input parameters, and 2) use of a specific model together with these parameters to estimate power. The difference between the two kinds is that the first does not need to know the inner structure of a module once the power library is established, while the second needs to know the inner structure for each specific application.

The method in [2] uses a power factor approximation technique, which treats all circuit input bits as digital white noise and uses the product of an effective coefficient and an input vector feature to reflect power dissipation; this leads to relative errors as high as 80%. Method [3] gives more accurate results by treating different modules differently; however, it is not a general method, since the equations are assumed to be provided by users for the different modules. In method [4], a power table with four variables (average signal probability Pin, average signal transition density Din, spatial correlation SCin and temporal correlation TCin) as its dimensions is first built for each RTL module. The estimation process is then greatly simplified while good accuracy is preserved. Its disadvantage is that building the power library is very time-consuming. The inner-node-sampling method in [5] is quite different from the methods above: it focuses on the STG (state transition graph) and takes part of the inner node transition information as input parameters; a transformation equation based on these parameters gives the final result. However, this method requires a detailed circuit netlist in advance, which makes it unsuitable for practical RTL power estimation.

Methods [6] and [7] are general equation models; they use LMS (least mean square) or other fitting algorithms to determine the coefficients in their models. Their parameters are much like those used in [4]. They calculate quickly and need much less time to build the library, but are less accurate.

A careful study of the relationship between input vectors and module power dissipation shows that the properties of the individual input ports should not be neglected. Based on this observation, we propose an equation model that uses the parameters of every input port instead of their averages. We do not use mathematical fitting algorithms, since all of them are too time-consuming. Instead, we adopt our new
specific simulation procedure to determine the coefficients of our model. Section 2 analyzes the properties of module power. Sections 3 and 4 describe the model and the specific simulation procedure that determines the model coefficients. Section 5 evaluates the accuracy of our model, together with a time complexity comparison. The last section is the conclusion.

2. Preliminaries

2.1 Analysis of RTL module power dissipation

The majority of current VLSI circuits are manufactured in CMOS processes. The four main sources of power dissipation in CMOS circuits are:
1) Node signal transitions, which charge and/or discharge parasitic capacitances.
2) Glitch power, which is due to unnecessary transitions (glitches).
3) Short-circuit current: since any signal transition lasts for some time, a momentary current flows from VDD to GND during the transition.
4) Static leakage current, or sub-threshold current.

Since 1), 2) and 3) are associated with dynamic transitions of the circuit, we call them dynamic power; 4) is independent of transitions, and we call it static power. Item 2) is extremely difficult to calculate, since it can only be observed using real-delay simulators, and it can account for as much as 20% of total dynamic power. Item 3) is determined by the average signal slew rate. A suitable equation for dynamic power is:

$$P_{dynamic} = 0.5\,V^{2} \sum_{i=1}^{m} C^{i}_{eff}\,E^{i}_{avg} \qquad (1)$$

in which V is the operating voltage, $C^{i}_{eff}$ is the effective capacitance on node i, and $E^{i}_{avg}$ is the average transition
probability of node i. Static power 4) is mainly determined by circuit scale and process technology, and can be described by a function directly proportional to the circuit scale.

Practically speaking, RTL simulation problems are black-box problems, i.e. the inner structure of each module is hidden from
users, regardless of whether the module is a hard module, which contains low-level gate structure and interconnection information, or a soft module, which contains only VHDL code. For soft modules, there is no guarantee that the final implementation is exactly the same as the one in our power library, because different synthesis tools generate different implementations. If we stick to the module in our library, errors are inevitable. Fortunately, [9] has pointed out that mismatches in the final implementation caused by technology, library and synthesis tools tend to have limited variance, although their absolute values can be significant. A library based on one technology can easily be adapted to other technologies; only a technology scaling parameter Stech is needed for this modification. Technology tuning is performed once and for all. We have built an inner-structure-irrelevant
model, which handles different kinds of RTL modules in the same way for generality. The only information available for evaluating a system-level design is the module list and its interconnections. After behavioral-level simulation, the input and output vectors of each module are known, so
a thorough study of the vectors is essential to find their contribution to module power dissipation; this is discussed later in this section.

2.2 Input vector property

In RTL power analysis, the average signal probability Pavg and the average signal transition density Davg are often used [7]. Take a circuit module with m input ports as an example. For a single input port i with inputs V = {v1, ..., vn}, where each vj equals 0 or 1, we define the signal probability Pi and the signal transition density Di:

$$P_i = \frac{1}{n}\sum_{j=1}^{n} v_j, \qquad D_i = \frac{1}{n-1}\sum_{j=1}^{n-1} v_j \oplus v_{j+1} \qquad (\text{def. 1})$$

Obviously, P is the proportion of logical highs in n vector cycles, while D is the proportion of signal transitions in the n-1 pairs of adjacent vector cycles. Pavg and Davg are then defined as:

$$P_{avg} = \frac{1}{m}\sum_{i=1}^{m} P_i, \qquad D_{avg} = \frac{1}{m}\sum_{i=1}^{m} D_i \qquad (\text{def. 2})$$

The two parameters are not independent; the constraint between them is:

$$\frac{D}{2} \le P \le 1 - \frac{D}{2}$$

Statistically, P is a function of D, and the graph of this function resembles an inverted bell, so we focus on parameter D and assume that the effect of P is reflected indirectly.

2.3 Error evaluation

To evaluate the accuracy of our RTL model, we introduce two error measures, the average relative error REavg and the average total error TEavg, and we also define the maximum relative error Emax:

$$RE_{avg} = \frac{1}{p}\sum_{i=1}^{p} \frac{|\hat{x}_i - x_i|}{x_i}, \qquad TE_{avg} = \frac{\left|\sum_{i=1}^{p} \hat{x}_i - \sum_{i=1}^{p} x_i\right|}{\sum_{i=1}^{p} x_i}, \qquad E_{max} = \max_{i} \frac{|\hat{x}_i - x_i|}{x_i} \qquad (\text{def. 3})$$

where p is the total number of simulations, $V_i$ is the input vector sequence of the i-th simulation, and $x_i$ and $\hat{x}_i$ are the results of the gate-level simulator and the RTL simulator for $V_i$.

2.4 Relation between power dissipation and input vectors

The results of a large number of simulations with different input vectors show
that the signal transition density D is what most strongly affects the power dissipation of a circuit, so it is the decisive factor in our equation-based model.

1) For a given circuit, power dissipation is roughly proportional to the average Hamming distance of the input vectors. In FIG 1(a), each point is the simulation result for a vector sequence of length 100, with the transition probability evenly distributed over the input ports. The relationship tends to be linear, and very few points deviate from the line, which suggests that the error may be small if we use
a linear function to describe this relationship.

2) However, things are not so simple. As illustrated in FIG 1(b), if the transition probability is distributed unevenly over the input ports, the power dissipation departs from the straight line. The power value is not always the same even when the average input Hamming distance is the same, so a linear model in the average Hamming distance alone is inadequate to describe this relationship. There must be other parameters besides the average Hamming distance that affect module power.

3) We found that signal transitions on different input ports contribute differently. This is because the number of circuit nodes driven directly or indirectly by an input port varies greatly from port to port. In FIG 2, the horizontal axis is the input port number and the vertical axis is the unitary power dissipation. To obtain the unitary power dissipation, we applied 0/1 and 1/0 transitions to a given input port while holding fixed random values on the other ports, and recorded the total number of transitions of all circuit nodes. An average value was obtained by repeating this procedure 100 times. Clearly, the contribution of the various ports also varies
greatly.

FIG 1 Transition vs Hamming distance, panels (a) and (b)
FIG 2 Different contribution

3. Equation-based RTL power model

We believe it is inadequate to consider only the average transition density. If the information about input port contributions is neglected entirely, no later compensation using signal spatial or temporal correlation can make up for it. The power model is a complicated function of the Di:

$$P_{rtl\_dynamic} = f(D_1, D_2, \ldots, D_n) \qquad (3)$$

We therefore define a power contribution factor, which captures the relationship between a single input port and power dissipation:

$$\omega_i = \lim_{D_i \to 0} \frac{\partial Power}{\partial D_i} \qquad (4)$$

where Di denotes the signal transition density of the i-th input port. Our model is the first-order approximation of the complicated power surface f, obtained by Taylor expansion:

$$P_{rtl\_dynamic} = f(0, \ldots, 0) + \sum_{i=1}^{n} \omega_i D_i + \frac{1}{2!}(\cdots) + \cdots \approx \alpha \sum_{i=1}^{n} \omega_i D_i \qquad (5)$$

where $\alpha$ is the general power factor, n is the number of input ports, $\omega_i$ is the unitary contribution of the i-th input port, and $D_i$ is the transition density of the i-th input port. Here $f(0, \ldots, 0) = 0$, because a vector sequence without any transitions does not cause any power dissipation.

The first part of this model is dynamic power, which is proportional to the sum of the contributions of all input ports. The second part is static power, which is proportional to the number of gates in the module. Since static power is almost proportional to circuit size, we have:

$$P_{rtl\_static} = \beta\, m \qquad (6)$$

where $\beta$ is the static factor and m is the total number of nodes in the circuit.

4. Procedure of our algorithm

4.1 Coefficients of dynamic part

Instead of the traditional
fitting algorithms (as in [6, 7, 8]), we developed a new training-vector-free method that calculates the model parameters relatively quickly. In our method, a transformation procedure turns gate-level simulation results into model parameters, as follows:
a) For a given input port i of the circuit, set its signal transition density to 100%, i.e. the signal changes state in every cycle.
b) For the other input ports, apply random vectors.
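
As a rough illustration of steps a) and b), the following Python sketch builds a stimulus in which one chosen input port toggles in every cycle while all other ports receive random bits, and then checks the resulting per-port transition densities against the definition of D in Section 2.2. The function names, the 5-port and 100-cycle sizes, and the use of Python's `random` module are assumptions made for illustration only; the gate-level simulation that would consume this stimulus is not shown.

```python
import random


def signal_probability(bits):
    """P: proportion of logical highs over all vector cycles."""
    return sum(bits) / len(bits)


def transition_density(bits):
    """D: proportion of adjacent cycle pairs in which the bit changes."""
    changes = sum(a != b for a, b in zip(bits, bits[1:]))
    return changes / (len(bits) - 1)


def make_stimulus(num_ports, target_port, cycles, seed=0):
    """Return one bit sequence per input port (hypothetical helper).

    Step a): the target port toggles every cycle (density 100%).
    Step b): every other port is driven with random bits.
    """
    rng = random.Random(seed)
    start = rng.randint(0, 1)
    columns = []
    for port in range(num_ports):
        if port == target_port:
            column = [(start + t) % 2 for t in range(cycles)]
        else:
            column = [rng.randint(0, 1) for _ in range(cycles)]
        columns.append(column)
    return columns


# Illustrative sizes only: 5 input ports, 100 cycles, port 2 under test.
columns = make_stimulus(num_ports=5, target_port=2, cycles=100)
densities = [transition_density(c) for c in columns]
print(densities)
```

With this construction the port under test has D = 1 by design, while the densities of the remaining ports scatter around 0.5, so differences in simulated power between runs with different target ports can be attributed mainly to the toggling port.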