
Chapter 4: Static Power Optimization Techniques.ppt

Uploaded by: gsy285395. Document ID: 5584855. Upload date: 2019-03-09. Format: PPT. Pages: 97. Size: 3.39 MB.

Review of the previous lecture: clock gating; operand isolation; logic-level power optimization (technology mapping, logic restructuring); circuit-level power optimization.

Chapter 4: Multi-Voltage Domain Design Techniques

4.1 Multi-Voltage Domain Design

An important trend in VLSI is the SoC. Advances in process technology make SoCs possible, and rising design complexity calls for new design methods. The parts of an SoC have different performance requirements and can run at different voltages: parts that need high performance work in a high-voltage domain, the others in a low-voltage domain. A single part can also run at different voltages depending on its workload. The supply voltage can be varied in several ways:

- Static Voltage Scaling (SVS): different blocks or subsystems are given different, fixed supply voltages. This is the simplest multi-voltage design.
- Multi-level Voltage Scaling (MVS): an extension of the static voltage scaling case where a block or subsystem is switched between two or more voltage levels. Only a few fixed, discrete levels are supported for different operating modes.
- Dynamic Voltage and Frequency Scaling (DVFS): an extension of MVS where a larger number of voltage levels are dynamically switched to follow changing workloads.
- Adaptive Voltage Scaling (AVS): an extension of DVFS where a control loop is used to adjust the voltage.

Even the simplest multi-voltage design (SVS) makes the design harder:

- Level shifters. Signals that go between blocks that use different power rails often require level shifters.
- Characterization and STA. With a single supply for the entire chip, timing analysis can be done at a single performance point. The libraries are characterized for this point, and the tools perform the analysis in a straightforward manner. With multiple blocks running at different voltages, and with libraries that may not be characterized at the exact voltage in use, timing analysis becomes much more complex.
- Floor planning, power planning, grids. Multiple power domains require more careful and detailed floorplanning. The power grids become more complex.
- Board level issues. Multi-voltage designs require additional resources on the board: additional regulators to provide the additional supplies.
- Power up and power down sequencing. There may be a required sequence for powering up the design in order to avoid deadlock.

4.2 Level Shifter

A high-voltage cell driving a low-voltage cell generally causes no functional problem, but the timing parameters become inaccurate, because library cells are characterized for a driver and a receiver at the same voltage. For accurate timing of the overdriven input, a dedicated cell is preferable.

When a low-voltage signal drives a high-voltage cell, the PMOS and NMOS transistors of the receiver can conduct at the same time, so a level shifter is mandatory. Such "up-shifting" level converters require two supply rails and typically share a common ground. The well structures cannot be joined together but must be associated with the supplies independently.

4.3 Level Shifter Placement

For a high-to-low crossing, the level shifter uses only the low supply, so it is normally placed in the low-voltage domain. When the blocks of the two voltage domains are far apart, buffers can be inserted; these buffers use the high supply.

For a low-to-high crossing, the level shifter is normally placed in the high-voltage domain. Because it uses both the high and the low supply, the low supply must be routed to it like a signal wire. Since the output driver requires more current than the input stage, we place the level shifter in the 1.2V domain, that is, in the destination domain.

4.4 Timing Issues in Multi-Voltage Design

Clock skew and static timing analysis must both be handled with the multiple voltage domains in mind.

Chapter 5: Leakage Current Control Techniques

The power crisis (Intel): leakage power is catching up with the active power in nano-scaled CMOS circuits.

The power crisis (IBM).

The problem with low-voltage design: leakage current.
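The tension noted above, where lowering the supply voltage cuts active power while leakage catches up, can be illustrated with two standard first-order models. This is a minimal sketch, not taken from the slides: dynamic power is modeled as alpha * C * Vdd^2 * f, and subthreshold leakage as falling one decade per subthreshold swing of threshold voltage. The function names, the 85 mV/decade swing, and all numeric values are illustrative assumptions.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order switching power: activity factor * capacitance * Vdd^2 * frequency."""
    return alpha * c_load * vdd ** 2 * freq


def subthreshold_leakage(i0, vth, swing=0.085):
    """Off-state current at VGS = 0.

    Assumes the current drops one decade for every `swing` volts of
    threshold voltage (0.085 V/decade is an illustrative value).
    """
    return i0 * 10 ** (-vth / swing)


if __name__ == "__main__":
    # Halving Vdd quarters the dynamic power (frequency held fixed here).
    p_high = dynamic_power(alpha=0.1, c_load=1e-9, vdd=1.2, freq=1e8)
    p_low = dynamic_power(alpha=0.1, c_load=1e-9, vdd=0.6, freq=1e8)
    print("dynamic power ratio:", p_low / p_high)

    # Lowering Vth by one swing to recover speed raises leakage tenfold.
    i_before = subthreshold_leakage(i0=1e-6, vth=0.385)
    i_after = subthreshold_leakage(i0=1e-6, vth=0.300)
    print("leakage ratio:", i_after / i_before)
```

The quadratic gain from a lower supply is what motivates low-voltage design, and the exponential sensitivity of leakage to threshold voltage is why the leakage control techniques of this chapter are needed alongside it.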

5.1 Low-Voltage, Low-Threshold-Voltage Circuit Design

Why low-voltage design?

- It is required by small-geometry devices. The critical electric field for hot-carrier degradation at the drain is Em: 0.2 MV/cm. At 0.35 um the breakdown voltage is around 6 V, and the operating voltage is generally required to be 1/3 to 1/2 of the breakdown voltage. Scaling down generally follows the constant-field rule.
- It is required by low-power design.

5.2 Main Sources of Leakage Current

- PN reverse-bias current (I1)
- Weak inversion (I2), the subthreshold current
- Gate-induced drain leakage (I3)
- Gate oxide tunneling (I4)

5.3 Leakage Control Techniques

- Power gating
- Stacked CMOS
- Dual (multi) threshold CMOS

I. Power Gating

1. Power gating versus clock gating

Clock gating only shuts off the clock: it saves dynamic power, while static power is unchanged. Power gating shuts off the supply: both dynamic and static power are eliminated, apart from the leakage of the switch transistors themselves.

2. Applicability of power gating, its problems, and an overview of solutions

Power gating is more invasive than clock gating in that it affects inter-block interface communication and adds significant time delays to safely enter and exit power gated modes. Shutting down power to a block of logic may be scheduled explicitly by control software as part of device drivers or operating system idle tasks. Alternatively it may be initiated in hardware by timers or system level power management controllers.

In any event, we are faced with architectural trade-offs between:

- the amount of leakage power savings that is possible
- the entry and exit time penalties incurred
- the energy dissipated entering and leaving such leakage saving modes
- the activity profile (proportion and frequency of times asleep or active)

A cached CPU subsystem can typically be dormant or inactive for long periods, making power gating attractive. But there are some trade-offs that must be considered:

- Power gating the entire CPU provides very good leakage power reduction. But wake-up-time response to an interrupt has significant system level design implications.
- If the cache contents are lost every time the CPU is powered down, then there is likely to be a significant time and energy cost in all the bus activity to refill the cache when it is powered up.
- The net energy savings depend on the sleep/wake activity profile, that is, on how much energy was saved when power gated versus the energy spent in reloading state.

A peripheral subsystem may have a much better defined profile than a CPU. But there are still some trade-offs. In particular, it may be necessary to restore state quickly on wake-up to maximize power savings:

- The device driver may be required to explicitly load/restore key state or initiate hardware sequencer control as part of the sleep/wakeup sequence, but this places a significant burden on software.
- A better approach may be for the peripheral to store key state internally during sleep mode, but this requires special circuitry and additional control.

A multi-processor CPU cluster is a case where one or more processors may be power gated off completely. Here we assume that a processor is powered down only when it has completed a task and is idle, waiting for another task to be assigned:

- Power gating individual CPUs provides very good leakage power reduction.
- Because the CPU has completed its task, the fact that the local cache contents are lost when it is power gated is not a problem. The CPU is awoken clean and reset, ready to execute and cache the next task it is given.
- Optimized energy savings may well require adaptive shutdown algorithms that vary the number of CPU cores power gated and active with varying workload.

3. Implementing power gating

- Externally switched power supply: for long idle periods.
- Internal power gating: for short idle periods.
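The trade-off described above, between the leakage energy saved while a block is gated and the energy spent entering the mode, leaving it, and reloading state, can be made concrete with a break-even estimate. This is a hedged sketch under simplifying assumptions: leakage power is treated as constant, the overhead energies are lumped into three numbers, and all names and values are hypothetical rather than taken from the slides.

```python
def break_even_idle_time(p_leak_on, p_leak_gated, e_entry, e_exit, e_restore=0.0):
    """Shortest idle time (seconds) for which power gating saves energy.

    p_leak_on    : leakage power of the block when left powered (W)
    p_leak_gated : residual leakage through the switches when gated (W)
    e_entry      : energy spent entering the gated mode (J)
    e_exit       : energy spent leaving the gated mode (J)
    e_restore    : energy spent reloading lost state, e.g. a cache refill (J)
    """
    saved_power = p_leak_on - p_leak_gated
    if saved_power <= 0:
        raise ValueError("gating must reduce leakage power")
    return (e_entry + e_exit + e_restore) / saved_power


def net_energy_saved(idle_time, p_leak_on, p_leak_gated, e_entry, e_exit, e_restore=0.0):
    """Energy saved (J) by gating for `idle_time` seconds; negative means a loss."""
    overhead = e_entry + e_exit + e_restore
    return (p_leak_on - p_leak_gated) * idle_time - overhead


if __name__ == "__main__":
    params = dict(p_leak_on=10e-3, p_leak_gated=1e-3,
                  e_entry=2e-6, e_exit=3e-6, e_restore=4e-6)
    print("break-even idle time (s):", break_even_idle_time(**params))
    print("net saving for 10 ms idle (J):", net_energy_saved(10e-3, **params))
```

Idle periods shorter than the break-even time cost more energy than they save, which is one way to read the activity-profile item in the trade-off list and the distinction between long and short idle periods.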

Designing power gating involves the following critical issues:

- the switching network and the power gating controller
- isolation cells
- retention flops

Switching network design

The switching network should avoid multi-level (nested) power gating, which increases IR drop. The network can use "header" switches, "footer" switches, or both; using both gives a larger IR drop and a higher cost, so normally only one kind is used, most often headers.

With a header-style switch fabric, the internal nodes and outputs of a power gated block collapse down towards the ground rail when the switch is turned off. With a footer-style switch fabric, the internal nodes and outputs all charge towards the supply rail when the switch is turned off. Note that there is no guarantee that the power gated nodes will ever fully discharge to ground or fully charge to the supply. Instead, an equilibrium is reached when the leakage current through the switches is balanced by the sub-threshold leakage of the switched cells. This is one of the reasons why isolation cells are required on outputs of power gated blocks.

- Switch Vdd or Vss rather than both, in order to minimize the IR drop.
- Decide early in the design phase whether header or footer switches most naturally fit the system design.
- Header switches may be the most appropriate choice if external power gating will also be used on the chip.
- Header switches may be the most appropriate choice if multiple power rails and/or voltage scaling will be used on the chip (common ground).

A key concern in controlling the switching fabric is to limit the in-rush current when power to the block is switched on. Excessive in-rush current can cause voltage spikes on the supply, possibly corrupting registers in the always-on blocks, as well as retention registers in the power gated block.

One representative approach is to daisy-chain the control signal to the switches. The control signal from the power controller is connected to the first switch, which buffers the signal (with an appropriate delay) and sends it on to the next switch.

Another approach to turning on the switching fabric is to use several power-up control signals in sequence. The first control signal may turn on a set of weak or "trickle" switches, which initiate the power up but limit the in-rush current. The second control signal may then turn on the main set of power switches. These switches have multiple enable pins; typically, the smaller switch is turned on first to get the voltage up to 95 percent, then the bigger switch is turned on to reduce the IR drop.
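The two-stage turn-on described above can be checked with a rough first-order model. The sketch below is an illustration under stated assumptions, not a sizing method from the slides: the gated block is treated as a discharged lumped capacitance, each switch as a fixed resistance, and the peak current of a stage as the voltage still to be charged divided by the resistance in circuit. The function names and resistor values are hypothetical.

```python
def peak_inrush_single(vdd, r_main):
    """Peak current (A) if only the main switch closes onto a discharged block."""
    return vdd / r_main


def peak_inrush_staged(vdd, r_weak, r_main, threshold=0.95):
    """Peak current (A) when a weak switch charges the rail to `threshold` * Vdd
    before the main switch is enabled in parallel with it."""
    # Stage 1: the weak (trickle) switch alone sees the full supply voltage.
    stage1_peak = vdd / r_weak
    # Stage 2: both switches in parallel see only the remaining voltage gap.
    r_both = (r_weak * r_main) / (r_weak + r_main)
    stage2_peak = vdd * (1.0 - threshold) / r_both
    return max(stage1_peak, stage2_peak)


if __name__ == "__main__":
    vdd, r_weak, r_main = 1.0, 10.0, 0.5   # illustrative values only
    print("single-stage peak (A):", peak_inrush_single(vdd, r_main))
    print("two-stage peak (A):", peak_inrush_staged(vdd, r_weak, r_main))
```

In this model the weak switch sets the worst-case current, and the main switch contributes only its low on-resistance once most of the rail voltage is already established, which matches the stated purpose of reducing IR drop.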

Power-gating granularity: coarse grain

In coarse grain power gating, a block of gates has its power switched by a collection of switch cells. The sizing of a coarse grain switch network is more difficult than that of a fine grain switch, as the exact switching activity of the logic it supplies is not known and can only be estimated. But coarse grain gating designs have significantly less area penalty than fine grain. Most applications use coarse grain gating.

Power-gating granularity: fine grain

In fine grain power gating the switch is placed locally inside each standard cell. Since this switch must supply the worst case current required by the cell, it has to be quite large in order not to impact performance. The area overhead of each cell is significant (often 2x-4x the size of the original cell). The key advantage of fine grain power gating is that the timing impact of the IR drop across the switch and the behavior of the clamp are easy to characterize, as they are contained within the cell. This means that it is still possible to use a traditional design flow to deploy fine grain power gating.

Isolation cell design

- After a block is powered down its outputs float at an unknown level, and the loads they drive may end up with both PMOS and NMOS transistors conducting. Signal isolation is needed to force the outputs to a fixed value.
- The isolation cell output is normally the inactive state of the driven input. Adding isolation cells adds delay.
- Using pull-up or pull-down devices for isolation removes this delay, but the isolated signal then has multiple drivers. The power-gating controller must be designed carefully to avoid contention between the drivers.
- Whether to use an OR or an AND gate depends on whether the driven input is active high or active low. If an input raises an interrupt when it is low, an OR gate must be used to hold it high.
- Because a signal may drive several loads, the isolation cell is generally placed at the driving end.
- Signal isolation must satisfy certain timing requirements.

State retention design

Given a power switching fabric and an isolation strategy, it is possible to power gate a block of logic. But unless a retention strategy is employed, all state information is lost when the block is powered down. To resume its operation on power up, the block must either have its state restored from an external source or build up its state from the reset condition. In either case, the time and power required can be significant.

Retention registers typically have an auxiliary or shadow register that is slower than the main register but has much less leakage current. The shadow register is always powered up, and stores the contents of the main register during power gating. These retention registers need to be told when to store the current contents of the main register into the shadow register and when to restore the value back to the main register. This control is provided by the power gating controller.

State retention and restoration methods

- A software approach based on reading and writing registers
- A scan-based approach based on using scan chains to store state off chip
- A register-based approach that uses retention registers

State retention timing

The retention supply of the SRPG cell (VRET) must never be switched off. The SRPG cell can be slow, but its leakage must be small.

Implementing power gating requires addressing the following issues:

- Design of the power switching fabric
- Design of the power gating controller
- Selection and use of retention registers and isolation cells
- Minimizing the impact of power gating on timing and area
- The functional control of clocks and resets
- Interface isolation
- Developing the correct constraints for implementation and analysis
- Performing state-dependent verification for each supported power state
- Performing power state transition verification to ensure all legal state entry and exit arcs are simulated and verified
- Developing a strategy for manufacturing and production test

A complete power gating design comprises:

- the switching network
- isolation cells
- retention flops
- the power gating controller
- level shifters (if the design has multiple voltage domains)

Power cycle sequence

For power-down, a specific sequence is generally followed: isolation, state retention, power shut-off (see the figure). For the power-up cycle, the opposite sequence needs to be followed. The power-up cycle can also require a specific reset sequence.

4. Cadence Low-Power Flow

CPF (Common Power Format) is a standard proposed by Cadence and approved by the Silicon Integration Initiative. In a CPF-based flow the RTL does not need to be modified. The RTL can be instantiated any number of times, and each instance will have a different low-power behavior as specified by the corresponding CPF.

How to describe a power management scheme with CPF

The design itself proceeds as before; the power management scheme is described in CPF. In the figure, pdA and pdB can be powered down, and everything else belongs to pdTop by default.

Describing the power domains and power-down conditions:

    # Define the top domain
    set_design TOP
    # Define the default domain
    create_power_domain -name pdTop -default
    # Define PDA
    create_power_domain -name pdA -instances {uA uC} -shutoff_condition !uPCM/pso0
    # Define PDB, PSO when pso is low
    create_power_domain -name pdB -instances uB -shutoff_condition !uPCM/pso1
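The power cycle sequence described earlier in this section (isolation, then state retention, then power shut-off, with the opposite order on power-up) can be sketched as a small ordering helper of the kind a power gating controller model might use. The step names are hypothetical labels, not CPF keywords or signal names from the slides, and the optional reset sequence is deliberately not modeled because the text does not say where it falls.

```python
# Power-down order as stated in the text: isolate, retain state, shut off power.
POWER_DOWN_STEPS = ("assert_isolation", "save_state", "shut_off_power")

# Each power-down step paired with the action that undoes it (labels are assumptions).
UNDO_STEP = {
    "assert_isolation": "release_isolation",
    "save_state": "restore_state",
    "shut_off_power": "turn_on_power",
}


def power_down_sequence():
    """Return the ordered list of controller actions for entering the gated state."""
    return list(POWER_DOWN_STEPS)


def power_up_sequence():
    """Return the ordered actions for leaving the gated state:
    the power-down steps undone in the opposite order."""
    return [UNDO_STEP[step] for step in reversed(POWER_DOWN_STEPS)]


if __name__ == "__main__":
    print("power down:", power_down_sequence())
    print("power up:  ", power_up_sequence())
```

Deriving the power-up order from the power-down order keeps the two in step: isolation is the first thing applied and the last thing released, so floating outputs are never exposed to powered logic.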

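Describing isolation and state retention:

    # Active-high isolation
    set hiPin {uB/en1 uB/en2}
    create_isolation_rule -name ir1 -from pdB -isolation_condition uPCM/iso -isolation_output high -pins $hiPin
    # Define state retention (SRPG)
    set srpgList {uB/reg1 uB/reg2}
    create_state_retention_rule -name sr1 -restore_edge uPCM/restore0 -instances $srpgList

Describing level shifters:

    # Define level shifters in the "to" domain
    create_level_shifter_rule -name lsr1 -to pdB -from pdA
    create_level_shifter_rule -name lsr2 -to pdA -from pdB
    create_level_shifter_rule -name lsr3 -to pdTop -from pdB
    create_level_shifter_rule -name lsr4 -to pdA -from pdTop

CPF supports the complete VLSI design flow:

- Design: the functional design is done in RTL and the power intent is described in CPF.
- Checking: CPF syntax, compatibility of the RTL with the CPF, completeness of the CPF, and so on.
- Verification in low-power modes, for example block power-off and restart, and retention registers.
- Logic synthesis with CPF. DVFS requires multiple constraint files.
- Equivalence checking: the insertion of isolation and state retention cells makes it more complex.
- Test: reducing test power; testing of the isolation, state retention, and level-shifter cells.
- Physical implementation: power switch insertion, power-domain-aware placement and optimization, and so on.

Logic simulation with power-down modeled.

Logic synthesis (in the figure, the upper-left window shows the result without CPF). With CPF, synthesis adds:

- Isolation cells to all outputs of power domains
- Isolation cells to inputs where specified
- Level shifters to signals crossing voltage domains
- Replacement of all flops with retention flops where specified

Test for low power

II. Transistor Stacking

$$V_T = \frac{\sqrt{2\,\varepsilon_s \varepsilon_0\, q N_A\,(2\phi_B + V_{BS})}}{C_i} + 2\phi_B + \phi_{ms} - \frac{Q_f}{C_i}$$

How does the substrate voltage Vbb affect Vth?

- NMOS: a negative voltage on the p-type substrate raises Vth; a positive bias lowers it.
- PMOS: a positive voltage on the n-type substrate raises Vth; a negative bias lowers it.

When Vs > 0 and VG = 0, the effective gate-source voltage is VGS = -Vs.

Conclusions:

- The more transistors are stacked in series, the smaller the leakage.
- The lower the off transistor sits in the stack (the closer to ground), the smaller the leakage.
- Inserting a high-Vth transistor helps greatly in reducing leakage.

The reason follows from the negative VGS noted above.

III. Dual-Threshold Transistor Circuits

[Figure: nodes on the critical path use low Vth; the remaining nodes use high Vth (example nodes a, b, c, d).]

When the high Vth is raised beyond a certain value, static power increases again, because fewer circuits can then use the high-Vth devices.

[Figure: static power in uW for one Vth versus two Vth; axis ticks 63.2, 126.3, 189.5.]

Extensions of dual-threshold transistor circuits

IV. Substrate Bias / Variable-Threshold Transistor Circuits

VTCMOS circuits: three-terminal and four-terminal devices. Two circuit techniques are used:

- SSB and LCM form a feedback loop that compensates for variations in Vt (SSB: self-substrate bias; LCM: leakage-current monitor).
- Electronic switches connect the substrate to different bias voltages in standby and active modes (SPR: standby power reduction).

At 65nm and below, the body-bias effect decreases, reducing the leakage control benefits. TSMC has published information pointing to a factor of 4 reduction at 90nm, and only 2 moving to 65nm. Consequently, substrate biasing is predicted to be overshadowed by power gating.

Summary of power reduction techniques

Effect of the various power optimization techniques

Low-power physical design

- Floorplanning with multiple power domains
- Power delivery, through power planning and routing
- Insertion of power gating for low-power shut-off
- Placement, including placement of level shifter, isolation, and SRPG cells
- Optimization, including multiple threshold voltage (Multi-Vth) optimization, as well as multiple supply voltage (MSV) optimization
- Clock tree synthesis, ensuring the clock tree is well balanced and optimized for power
- Efficient routing, because the shorter the route length, the less power is dissipated, while timing and signal integrity must be preserved
- Analysis and verification, or signoff power analysis, to make sure power consumption is consistent with estimation, and that timing and IR drop are under control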
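The Multi-Vth optimization step listed above builds on the dual-threshold idea from earlier in the chapter: keep low-Vth cells where timing is critical and use high-Vth cells elsewhere. The following is a deliberately simplified sketch under stated assumptions, not a description of any tool's algorithm: each gate's slack is treated independently, a swap to high Vth costs a fixed delay penalty, and all names and numbers are hypothetical.

```python
def assign_vth(slack_ps, delay_penalty_ps):
    """Assign 'high' Vth to every gate whose timing slack covers the delay
    penalty of a high-Vth cell; keep 'low' Vth on the rest.

    Simplification: gates on the same path share slack, so a real flow
    would re-run timing analysis after each swap. That is not modeled here.
    """
    return {
        gate: ("high" if slack >= delay_penalty_ps else "low")
        for gate, slack in slack_ps.items()
    }


def total_leakage(assignment, leak_low_nw, leak_high_nw):
    """Sum per-gate leakage (nW) for a Vth assignment, one value per Vth class."""
    per_class = {"low": leak_low_nw, "high": leak_high_nw}
    return sum(per_class[vth] for vth in assignment.values())


if __name__ == "__main__":
    # Slack per gate in picoseconds; gates a and b sit on the critical path.
    slack = {"a": 0, "b": 5, "c": 40, "d": 120}
    assignment = assign_vth(slack, delay_penalty_ps=30)
    print(assignment)
    print("leakage, all low Vth (nW):", total_leakage(dict.fromkeys(slack, "low"), 10, 1))
    print("leakage, dual Vth (nW):   ", total_leakage(assignment, 10, 1))
```

Raising the delay penalty in this sketch, as a higher Vth would, moves more gates back to the low-Vth class, which mirrors the earlier observation that static power rises again once too few circuits can use the high-Vth devices.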
