Chapter 12: Backend Design

Outline
- Backend Design Flow
- Floorplan
- Place & Route
- Physical Verification
- Signal Integrity
- DFM/DFY

Steps of Backend/Physical Design
- Synthesis
- Floorplanning
- Placement
- Scan chain insertion and re-ordering (optional)
- Clock tree synthesis
- Routing
- Parasitic and netlist extraction
- Power analysis
- Signal integrity checking
- Final timing analysis (STA and simulation)
- ECO (optional)
- LVS/DRC
- Export GDSII
- LVS/DRC using sign-off tools

Backend Flow with ECO
- Engineering Change Order (ECO)
  - Achieved by adding a small number of cells in a limited area, sizing buffers, and routing the connections
  - Prevents disturbing the placement and routing of the rest of the chip
- Keep in mind: performance, power, size, reliability
- It is not impossible to develop "plug & play" tools

Floorplanning
- Based on the netlist, create areas of functionality on the chip
- Determine the placement of blocks
- Determine the placement of I/O pins
- Determine the power supply strategy
- Give feedback on how easy the floorplan might be to wire (global routing) and how big the chip is

Chip Floorplanning Considerations
- Chip-level floorplanning
  - High-speed block issue: location affects the timing performance
  - Analog block issue: clean Vdd/Vss;
    minimum spacing to digital blocks; I/O location
  - Die size issue: pin limited vs. core limited
  - Power/ground routing issue
    - Power ring width set according to power analysis
    - Power strip/mesh spacing
  - Pin placement and I/O ring issue (to be covered in the next class): pad pitch vs. bonding rule; ESD; noise isolation

Die Size Issue cont.
- Determine the area for standard cells: "utilization" of 70%? 80%? 90%?
- Extra space for clock tree synthesis
- Extra space for scan chain
- Layers for routing

Hard Macro Placement
- Macros are generally placed around the peripheral I/O ring
  - Leaves a contiguous area for standard cells
  - Gives the place-and-route tools more freedom during placement and routing of the standard cells
- The goal of macro placement is to:
  - Reduce timing-critical paths between the macros and interfacing logic
  - Reduce interconnections in the following order:
    1. Chip I/O to macros
    2. Macro to macro
    3. Macro to standard cell blocks

Power/Ground Development
- IR drop and electromigration
  - Power-net IR drop degrades the supply voltage level
  - Excessive current density in a metal wire causes electromigration failure, which breaks the metal connection
  - The IR drop effect is more significant as Vdd gets smaller
  - Current density is higher when the metal wire width is smaller

Power/Ground Development cont.
- Ring structure
  - Power rings around all layout blocks
  - Major power trunks between layout blocks
  - Difficult to guarantee the worst-case IR drop
- Strap structure
  - Simple, easy for routing
- Mesh structure
  - Distributes IR drop evenly
  - Spacing of power strips must be considered
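
The link between strap width, IR drop and current density noted in the power/ground slides can be illustrated with a small calculation. This is a minimal sketch, not a sign-off method: it treats one power strap as a single uniform resistor, and the function names and all numeric values (sheet resistance, strap length, current) are hypothetical, chosen only for illustration.

```python
# Rough single-strap estimate. All numbers are assumed, for illustration only.

def strap_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """Resistance of a uniform metal strap: R = Rs * (L / W)."""
    return sheet_res_ohm_per_sq * length_um / width_um


def ir_drop(current_a, resistance_ohm):
    """Voltage drop across the strap: V = I * R."""
    return current_a * resistance_ohm


def current_density_ma_per_um(current_a, width_um):
    """Current per unit strap width, a quantity EM limits are often stated in."""
    return current_a * 1000.0 / width_um


if __name__ == "__main__":
    sheet_res = 0.05   # ohm/square (assumed)
    length = 1000.0    # um (assumed)
    current = 0.010    # A (assumed)
    for width in (10.0, 5.0):
        r = strap_resistance(sheet_res, length, width)
        print(f"W={width:.0f} um: R={r:.1f} ohm, "
              f"drop={ir_drop(current, r) * 1000:.0f} mV, "
              f"J={current_density_ma_per_um(current, width):.1f} mA/um")
```

With these assumed values, halving the strap width from 10 um to 5 um doubles the resistance, the IR drop (50 mV to 100 mV) and the current density, which is why ring width and strip spacing are set from power analysis.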
- IR drop analysis: fix problems at an early stage

P/G Structures

Beware of Maximum Width Rule
- Maximum wire width is limited by thermal stress and local density rules
- Slotting vs. a "bus" of thin wires
- Disadvantages of slotting:
  - Slots may not be aligned with current flow
  - True IR drop is not known until after slotting
  - Happens especially for power/ground rings
- Commonly used for power/ground

Placement
- Based on a given floorplan, determine the location of cells in a given netlist
- Goals and objectives
  - Routability: guarantee the router can complete the routing step (global routing)
  - Timing: minimize all critical net
    delays
  - Minimize die size: make the chip as dense as possible
  - Signal integrity
- Check feasibility of routing after placement
- Logical effort: for paths with positive slack, reduce cell size

Congestion and Fix
(Figure: before and after views, with congestion areas marked)

Routing
- Complete power/ground/clock routing (clock tree synthesis)
- Complete detailed wire routing (conforming to wiring rules and order)
- Improve the density
- Minimize layer changes
- Improve critical paths and meet timing requirements
- Produce a routed design free of DRC/LVS violations

General Routing Flow
- Clock tree synthesis: add buffers/inverters, minimize clock skew and delay
- Post-placement optimization (PPO): fix setup violations
- Pre-route standard cells: VDD/VSS rails on metal 1; verify P/G connection and routing
- Route group nets: clocks, bus routing
- Post-route CTO: fix clock skew and insertion delay
- Global routing: critical paths, long wires, interconnection

Routing Flow cont.
- Track assignment and detail routing: wire connection
- Search and repair (DRC/LVS): fix routing violations (unconnected nets, shorts)
- Post-route optimization: fix timing
- Coarse LVS and DRC checking: metal width, notch and gap checking
- Data output
  - Stream out: GDSII format
  - Verilog out: hierarchical (for PT) / non-hierarchical (for Hercules)
  - Parasitic out: SPEF format (cell view)

Clock Tree Synthesis
- Objective: minimize clock skew, optimize clock buffers

Basic CTS Flow & Concepts

Clock Constraint
Define:
- Clock source: root pin, target insertion delay, target transition time at the clock port
- Clock endpoint: synchronous pin, ignore/exclude pin
- Driving cell, clock cell, delay cell: buffers, inverters, special clock cells
- DRC: maximum transition delay, maximum net capacitance, maximum fanout, number of clock buffer levels

Clock Skew
- Global skew: the clock arrival time difference between any two flip-flops
- Local skew: the clock arrival time difference between two flip-flops that are adjacent through combinational logic

Concept of Useful Skew
- Useful skew is a method of intentionally skewing a clock to improve the timing of a circuit
- It is also commonly used in ECO
- Warning: could cause problems in DFT scan insertion

Use CTS for High-Fanout Net Synthesis
- High-fanout pins: reset, scan_en
- High-fanout pins need to be balanced to guarantee functionality
- Use the CTS tool to synthesize high-fanout nets by inserting a balanced buffer tree
  - Minimizes both skew and insertion delay
  - But avoid using large buffers, to save power

Large SoC Clock Distribution
- Partition the design into several blocks
- CTS for each block
- Clock tree network at top level
(Figure labels: external clock, core internal clock net, global clock net)

H Tree for Top Clock Network
- Use big buffers to balance delay and clock skew
- Equal distance, equal loads, equal driving ability

Clock Distribution Case Study: Pentium Spines
Kurd et al., A multigigahertz clocking scheme for the Pentium 4 microprocessor, JSSC 2001

Clock Distribution Case Study: Intel's Itanium H Tree Clocking
Tam et al., Clock generation and distribution for the first
IA-64 microprocessor, JSSC 2000

Issues
- Large number of clock buffers added on the clock tree
  - Power consumption
  - Noise on supply lines
- Reduce power consumption
  - Wide wire widths
  - Clock gating cell placement
  - Limit the use of large clock buffer cells
- Reduce noise
  - Special clock buffer cells with decoupling capacitors

Extraction
- When detailed routing is complete, write out the hierarchical netlist and parasitics for back annotation
- Data management for the huge files of extracted parasitic data
- Accurate RC and timing models for nanometer designs
  - Width and spacing dependence
  - Resistance shielding
  - Local density effect

SDF Back Annotation
- Used in cell-based design flow
- Performs delay calculation on parasitic RCs in interconnect wires
- DSPF - Detailed Standard Parasitic Format
- SPEF - Standard Parasitic Exchange Format
- SDF - Standard Delay Format
  - Used for post-layout simulation
  - Can be written out from PrimeTime

Physical Verification
- DRC - Design Rule Check
  - Verifies the manufacturing rules, for example:
    - Internal layer checks
    - Wide metal checks (metal slotting needed for wide metal)
    - Layer-to-layer checks
- DFM/DFY
  - Example: antenna rule check
- LVS - Layout vs. Schematic
  - Compares layout to schematic, every cell and net

DRC Trends and Challenges
- 75% of time spent on metal layer and via checks
- ERC-type checks increasing
- Rise of pre-tapeout DFM utilities

LVS
- Layout vs. Schematic (LVS): check the physical layout against the functional gate-level schematic to ensure all intended connectivity has been maintained
- Steps:
  - Extract the netlist from the layout (GDSII)
  - Compare the netlist with the one produced after routing and optimization
- Hints:
  - Most LVS errors are caused by manual layout or congestion
  - "Virtual connect" (connected by text) could cause a killer failure

Signal Integrity
- Signal integrity is the ability of a signal to generate the correct response in a circuit
- The signal has digital levels at the appropriate and required voltage levels at the required instants of time
- Crosstalk, IR drop, electromigration

Layout Parasitic vs. Circuit Performance
- Interconnect parasitic resistors, capacitors and inductors cause extra timing delay
- Additional power consumption caused
  by parasitic RC
- Inter-wire capacitances cause coupling noise and will dominate interconnect wire delays
- Parasitic resistances in the power supply cause voltage drop and may degrade circuit performance
- Higher current density in the power net may cause electromigration failure

Inductance Effects
- Inductive coupling is significant for long interconnects and for very fast signal edge rates
- Inductive coupling is negligible for short interconnects, since the signal edge time is long compared to the flight time of the signal
- Inductance extraction and simulation are more difficult than for capacitance
(Figure labels: C, L)

Crosstalk Analysis
- Definition
  - Aggressor: generates crosstalk
  - Victim: receives crosstalk
- Timing sensitive
  - Crosstalk analysis that takes signal transition timing windows into account can eliminate pessimistic delay calculation
- The crosstalk spike is related to the capacitance value and the victim driver impedance

Crosstalk Analysis cont.
- Timing sensitive

Crosstalk Prevention
- Prevent crosstalk from the synthesis stage
  - Minimize the driver size on non-critical paths to reduce the number of aggressors
  - Apply a maximum transition time (set_max_transition) in physical synthesis/placement to avoid long nets

Crosstalk Prevention cont.
- From the routing stage
  - Effective spacing between noisy regions and quiet regions
  - Shielding between critical paths

Crosstalk Prevention cont.
- From the routing stage, cont.
  - Buffer insertion: an inserted buffer breaks up the coupling capacitance of a long wire

Crosstalk Prevention cont.
- From the routing stage, cont.
  - Buffer sizing
    - Increase the driver size of the victim
    - Decrease the driver size of the aggressor
  - Track reordering, based on timing windows

Crosstalk Prevention cont.
- For inductive crosstalk
  - Coplanar shields
  - Reference plane
  - Staggered inverters/buffers

Electromigration Effects
- Electrons flowing through the wires collide with metal atoms, producing a force that causes the wires to break
- Caused by high current densities and high frequencies in long, very thin metal wires
- MTTF (Mean Time To Failure) decreases as current density and temperature increase
- Can be mitigated by appropriate wire sizing

Fix EM
- Controlling current density to limit electromigration failure is needed in design and verification
- Layout optimization:
  - Increase the power line width and number of layers
  - Increase the number of power pads
  - Increase the connections
- Issues
  - More metal (adds 8% cost per layer)
  - Larger, slower designs (grow in x and y)

Other Considerations
- ESD (to be covered in the next class)
- Package vs. performance (to be covered in the next class)
- DFM/DFY

DFM/DFY
- Technologies at 90nm and below face yield challenges
- DFM: Design for Manufacturability
- DFY: Design for Yield

DFM and DFY
- DFM is the management of technology constraints (sizing rules) applied to the layout
- A manufacturable design, however, is not necessarily a highly robust or high-yielding design
- DFY, as part of Design for Manufacturability, concentrates on the development and quality of the circuit design in the pre- and post-layout phases
- DFY is
  the management of design sensitivities to the manufacturing process and helps to guarantee high-yielding devices

DFM/DFY Methodology
- Optimal resolution enhancement technology (RET)
  - Mask and exposure
  - Optical Proximity Correction (OPC)
  - Phase Shift Mask (PSM)
- Yield enhancement and optimization technology
  - DFM rules implementation
  - Overcomes the limits of OPC
  - Yield checking during the layout stage
  - Supported by EDA tools

Why Need RET?
(Figure: wavelength used vs. process generation)

Design for Manufacturing
- Not everything can be done by mask and exposure:
  - Corrections are not complete
  - Some designs cannot be built at all with certain RET technologies
  - Of those that can be built, some are more manufacturable after RET than others
- DFM/DFY-driven routing
  - OPC-driven routing
  - PSM-driven placement
  - DFM rule implementation

DFM/Y Rules
- Limit the use of minimal poly-enclosed gates, minimally enclosed vias and singly
  contacted lines
  - Better yield
  - Less resistance
- Example: via void rules (doubled vias)

Current DFM/Y Design Flow Supported
1. Load design
2. Perform antenna fixes
3. Add contacts/vias
4. Metal fill and slotting
5. Verify LVS and DRC

Why Need Double Vias?
- Copper processing causes new problems for vias
- Voids in Cu migrate under thermal stress towards vias
- If enough voids migrate to a via, it can cause failure
- Worse at 90/65nm due to the increased stress of smaller vias
- Voids can migrate long distances (10 microns)
- Voids can migrate around corners

Yield vs. Area

Antenna Rules
- Antenna rules have nothing to do with the traditional definition of an antenna
  - Really a collector of static charge, not electromagnetic radiation
- The antenna problem only happens during manufacturing
  - Plasma-based processes for etching and oxide deposition
  - The plasma etcher induces a voltage on floating wires, stressing the thin gate oxides
- Not a new problem, but sub-100nm materials may make it a lot worse

Antenna Effect
- Depends on the gate size and the length of the wire
- The metal or poly leads act like an antenna, collecting charge (negative charge)
- Results in gate oxide breakdown
(Figure labels: driver (diffusion), poly gate, M1, M2)

Fix Antenna Rule Violation
- Antenna effect check is part of DRC/ERC
- Solutions:
  - Add an antenna diode near the input pin to provide a conducting path to GND
  - Add a jumper to minimize the amount of charge collected by a floating node
  - Add a buffer to cut the wire and provide a discharge path

Metal Filling
- A narrow metal wire separated from other metal receives a higher density of etchant than closely spaced wires
  - The narrow metal can get over-etched
  - This changes the thickness of the metal line
- Minimum metal density rules are used to control this
- Fill empty tracks with metal shapes to meet the minimum metal density rules

Metal Filling cont.
- Caution: metal fill changes the parasitics
  - Width and spacing dependent
  - Needs smart parasitic extraction
  - Timing-driven metal fill

Problem Facing at 90nm and Below
- DFM techniques such as wire spacing, wire widening, redundant via insertion and metal fill impact crosstalk and timing significantly
- The era of interconnect synthesis

Trend of Backend Tools
- Considering timing, area, power, DFM/DFY at one time

Thank you