Design to Tight Power Supply Requirements
FTF-NET-F0036
Chuck Corley | DMTS
Mohit Kedia | Engineering Rotation Program
APR.2014
TM External Use

Abstract: Design to Tight Power Supply Requirements
• Session Length: 2 hours
• Freescale has begun specifying core supply voltages with ±30 mV tolerances. Customers are accustomed to ±5% and are asking how to achieve this tighter requirement. This presentation discusses the specification and what customers need to know for successful designs.

Agenda
• Defining the problem
  − 3% DC voltage requirement
  − Time versus frequency domain
• VDD/PLAT Voltage Specification for T (28nm) series parts
• Current step observations for T4240RDS
• Current step observations for T1040QDS
• Discussion of current slew rate

Defining the Problem
Requirements:
• Power supply must supply a stable voltage reference
• Power supply must distribute adequate current
Observations:
• Switching power supplies actually supply a digitally varying voltage (~500 kHz)
• Microprocessor's current demand may vary as fast as core frequency (~2 GHz)
• Power Distribution Network (PDN) has resistance, capacitance, inductance, mutual capacitance, and mutual inductance through PCB, socket, vias, and capacitors.
• Changes in current at a particular frequency cause voltage changes at that frequency across these impedances.
Problem:
• Silicon vendors are tightening the voltage specifications while the current continues to increase.
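To make the gap between the two tolerance styles concrete, the following minimal sketch converts between an absolute tolerance in millivolts and a percentage of the nominal rail. The function names are illustrative, and the 1.0 V nominal is an assumption chosen only for round numbers.

```python
def mv_to_percent(tolerance_mv: float, nominal_v: float) -> float:
    """Express an absolute tolerance (mV) as a percentage of the nominal voltage."""
    return tolerance_mv / (nominal_v * 1000.0) * 100.0


def percent_to_mv(tolerance_pct: float, nominal_v: float) -> float:
    """Express a percentage tolerance as an absolute tolerance in mV."""
    return tolerance_pct / 100.0 * nominal_v * 1000.0


if __name__ == "__main__":
    nominal_v = 1.0  # assumed nominal core voltage, for illustration only
    print(f"±30 mV at {nominal_v} V is ±{mv_to_percent(30.0, nominal_v):.1f}%")
    print(f"±5% at {nominal_v} V is ±{percent_to_mv(5.0, nominal_v):.0f} mV")
```

At a 1.0 V nominal, ±30 mV corresponds to the 3% figure in the agenda, while the ±5% that customers are accustomed to would allow ±50 mV.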
SOCs incorporating the e6500 core in 28nm

e6500 core-based parts | T4240 | T4160 | B4860 | T2080 | T2081 | B4420
e6500 cores/threads | 12/24 | 8/16 | 4/8 | 4/8 | 4/8 | 2/4
Max core frequency (Hz) | 1.66G | 1.8G | 1.66G | 1.8G | 1.8G | 1.6G
Clusters / L2 per cluster | 3/2MB | 2/2MB | 1/2MB | 1/2MB | 1/2MB | 1/2MB
DDR3/3L memory controllers | 3 | 2 | 2 | 1 | 1 | 1
CPC (L3) cache per controller | 512KB | 512KB | 512KB | 512KB | 512KB | 512KB
DMA controllers/channels | 2/8 | 2/8 | 2/8 | 3/8 | 3/8 | 1/8
StarCore SC3900 FVP core subsystems | NA | NA | 6 | NA | NA | 2
StarCore clusters / L2 per cluster | NA | NA | 3/2MB | NA | NA | 1/2MB
Package | 1932 FCPBGA, 45 mm × 45 mm, 1 mm pitch | 1932 FCPBGA, 45 mm × 45 mm, 1 mm pitch | 1020 FCPBGA, 33 mm × 33 mm, 1 mm pitch | 896 FCPBGA, 25 mm × 25 mm, 0.8 mm pitch | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm pitch | 1020 FCPBGA, 33 mm × 33 mm, 1 mm pitch

Tight Core Voltage Specifications for 28nm e6500 core-based parts

Parameter | T4240 | T4160 | B4860 | T2080/81 | B4420
Core and platform supply voltage - startup | 1.05 V ± 30 mV | 1.05 V ± 30 mV | 1.05 V ± 30 mV | 1.025 V ± 30 mV | 1.05 V ± 30 mV
Core and platform supply voltage - normal operation | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV
"Operation at 1.1V is allowable for up to 25ms at initial power on." | footnote 6 | footnote 6 | footnote 6 | footnote 3 | footnote 5

"Voltage ID (VID) operating range is between 0.95V to 1.05V. Regulator selection should be based on Vout range of at least 0.9V to 1.1V, with resolution of 12.5mV or better."
0.9V but changing to 0.95; 0.9V but changing to 0.95; footnote 1; footnote 7; 0.975–1.025; 0.9V

Section 4.2.2; Section 4.2.2; S3.2.2:; Section 4.2.2 10A step; Footnote 4; S3.2.2: +50/-30 mV 1200MHz; +100mV transient; 20A step

"it is recommended that the system designer place at least one (0.1μF) decoupling capacitor at each VDD, VDDC, CVDD, OnVDD, DVDD, EVDD, GnVDD, and LnVDD pin of the device."
Section 4.3 | Section 4.3 | Section 3.3 | Section 4.3 | Section 3.3
Spec Rev | Rev G | Rev D | Rev H | Rev E/D | Rev C

"…maintain the transient power surges to less than +50 mV (negative transient undershoot should comply with specification of VID-30mV) for current steps of up to 20 A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew rate of 12 A/us."
± 30 mV; no step spec'd

SOCs incorporating the e5500 core; some 28nm

e5500 core-based parts | P5020/10 | P5040/21 | T1040/42 | T1020/22
e5500 cores | 2/1 | 4/2 | 4 | 2
Max core frequency (Hz) | 2.0GHz | 2.2GHz | 1.4G | 1.4G
L2 cache per core | 256KB | 512K | 256KB | 256KB
Memory controllers | 2 | 2 | 1 | 1
CPC (L3) cache per controller | 1MB | 1MB | 256KB | 256KB
DMA controllers/channels | 2/4 | 2/4 | 2/8 | 2/8
Package | 1295 FCPBGA, 37.5 mm × 37.5 mm, 1 mm | 1295 FCPBGA, 37.5 mm × 37.5 mm, 1 mm | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm
Technology | 45nm | 45nm | 28nm | 28nm

Tight Core Voltage Specifications for e5500 & 28nm e5500 core-based parts

Parameter | P5020/10 | P5040/21 | T1040/42/20/22
Core and platform supply voltage | 1.0 V ± 50 mV (core frequency = 1200 MHz); 1.1 V ± 50 mV (core frequency > 1200 MHz) | 1.1 V ± 50 mV (core frequency ≤ 2000 MHz); 1.2 V ± 30 mV (core frequency > 2000 MHz) | 1.025 V ± 30 mV
"Operation at 1.1V is allowable for up to 25ms at initial power on." | NA | NA | footnote 5
"Voltage ID (VID) operating range is between 0.975V to 1.025V. Regulator selection should be based on Vout range of at least 0.9V to 1.1V, with resolution of 12.5mV or better." | NA | NA | footnote 7
"…maintain the transient power surges to less than +50 mV (negative transient undershoot should comply with specification of VID-30mV) for current steps of up to 20 A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew rate of 12 A/us." | NA | NA | Section 4.2.2, 10A step

"…at least one (0.1μF) decoupling capacitor at each VDD, VDDC, CVDD, OnVDD, DVDD, EVDD, GnVDD, and LnVDD pin of the device."
Section 3.4, 0.01 or 0.1 μF* | Section 4.3, 0.01 or 0.1 μF* | Section 4.3
Spec Rev | Rev 0 | Rev 0 | Rev E
Core and platform supply voltage - startup
Core and platform supply voltage - normal operation: VID ± 30 mV (T1040/42/20/22)
* Better to use the largest capacitance that will fit on the footprint under the part.

What is Voltage ID (VID) for 28nm Products?
• A specific method of selecting the optimum voltage level to guarantee performance and power targets.
  − QorIQ device contains fuse block registers defining the required voltage level. This EFUSE definition is accessed through the Fuse Status Register (DCFG_FUSESR).
  − Customer system must use the VID to change the voltage regulators in the system in a reliable and safe manner.
• QorIQ Chassis Architecture Specification, Generation 2 Revision 0.9 defines the general EFUSE definition.
  − A set of 24 efuses ([0-23]) that determine the speed bin and voltage requirements for the device domains.
  − The range and steps are much more flexible than actually needed by manufacturing; only the fuses necessary to provide the required voltages will be implemented.

Voltage Specification Terms Better Defined
[Figure: VDD and IDD versus time around a load step. VDD trace annotated with VID or DC set point, tolerance band VID +50 mV / -30 mV, switching ripple, overshoot, and undershoot (principal silicon concern). IDD trace annotated with step-up, step-down, and load-step.]

Power Distribution System Theory – VRMs
• Voltage Regulator Modules (VRMs) use feedback to hold a constant supply voltage (up to the frequency of the inherent low pass filter).
• QorIQ parts allow feedback from the die voltage plane – SENSEVDD
• T4240QDS Intersil VRM (typical of most VRMs) advertises ±0.5% Closed-loop System Accuracy Over Load, Line and Temperature [for transients < 1/3 (to 1/5) of switching frequency – 350-500 kHz].
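As a rough illustration of the VRM bullets above, the sketch below converts a percentage closed-loop accuracy into millivolts and estimates the frequency range over which the feedback loop can be expected to regulate. It assumes that 350-500 kHz is the switching frequency and that the loop responds up to between 1/5 and 1/3 of it; both are readings of the slide rather than vendor data, and the function names are hypothetical.

```python
def accuracy_mv(set_point_v, accuracy_pct):
    """Convert a percentage regulation accuracy into +/- millivolts at a set point."""
    return set_point_v * accuracy_pct / 100.0 * 1000.0


def loop_bandwidth_khz(switching_khz):
    """Return the assumed (low, high) loop bandwidth: 1/5 to 1/3 of switching frequency."""
    return (switching_khz / 5.0, switching_khz / 3.0)


if __name__ == "__main__":
    # 1.05 V is the boot-up set point quoted in the specification tables.
    print(f"±0.5% of 1.05 V is ±{accuracy_mv(1.05, 0.5):.2f} mV")
    for f_sw in (350.0, 500.0):  # assumed switching frequencies in kHz
        low, high = loop_bandwidth_khz(f_sw)
        print(f"f_sw = {f_sw:.0f} kHz -> loop regulates up to roughly {low:.0f}-{high:.0f} kHz")
```

Under these assumptions the regulator itself consumes only about ±5 mV of the ±30 mV budget, but only for disturbances slower than roughly 70 to 170 kHz; faster events have to be handled by the capacitors and planes.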
[Figure: VRM feedback loop with Vref, LPF, bulk caps, planes, and bypass caps. From Intel VRM 11.1.]

[Figure: VRM model and PDN system. +12V input, ST and SB switches, LF, bulk caps with ESR (multiple 22 to 1000 uF caps), planes, bypass caps (~one 0.1 uF per pin), PKG, and DIE. SENSEVDD_P and SENSEVDD_N feed back to Vref; VDD = VID ± 30 mV.]

Power Distribution System Theory - Ripple
• The most common meaning of ripple in electrical science is the small unwanted residual periodic variation of the direct current (dc) output of a power supply which has been derived from an alternating current (ac) source. This ripple is due to incomplete suppression of the alternating waveform within the power supply.
[Figure: Voltage versus time showing VRIPPLE (peak-to-peak) at the bulk capacitors; PWM current spikes from the +12V supply when ST conducts.]

Power Distribution System Theory – AC Impedance
• Inductance in the traces and vias (and socket pogo pins) creates an AC impedance (ZS) that causes dv/dt changes at the load with varying di/dt.
• These dv/dt changes would "ride" on any DC voltage droop.
• Decoupling capacitors and capacitive plane layers are added to reduce the AC impedance between VDD and GND.
[Figure: Source VS = 1.00 V with Vref, LPF, and series impedance ZS feeding load current IL and load voltage VL; 30 each 22 uF bulk caps, planes, 83 each 0.1 uF bypass caps, VDD at the die, SENSEVDD_P and SENSEVDD_N feedback.]

Reactive Elements in the PDN cause dv/dt
• Well documented problem (see references slide)
• Silicon vendors are tightening the DC specifications at lower supply voltages.
• Customers are demanding more information from silicon vendors to aid in designing compliant power supplies (Power Distribution Networks or PDNs).

The PDN Problem in the Frequency Domain ?
[Figure: Total impedance versus frequency (log scale). Z_total (Ohms) from 1.0E-05 to 1.0E+02 over 1.E+04 to 1.E+09 Hz, with regions labeled VRM, board level PDN design, and on-chip/package (Z_Pkg, Z_Die, cut-off). Target lines ΔV(f)/ΔI(f) = Ztarget are shown for P5020 (50 mV) and T4240 (3%).]

Power Distribution System Design
• A common rule-of-thumb (in absence of better di/dt data from the vendor) is to assume that Δi is 50% of max power/nominal voltage (50% of 67 W / 1.0 V ≈ 34 A). Δv for the same calculation would be the AC variance allowed (30 mV for the T4240).
• Z = Δv/Δi = 0.88 mΩ
[Figure: Z (Ω) from 0.0001 to 1.0000 versus frequency from 1 Hz to 1 GHz, with the target impedance marked.]

Latest T4240 Voltage Specifications
Core and Platform Supply Voltage – VID (or 1.05V bootup) ± 30 mV
• Supply voltage measured at the voltage sense pins
• Combined DC and AC variance from nominal not to exceed ±30 mV except for an overshoot of less than +50 mV during transients. Transient voltages may result from current steps of up to 20A with slew rates of 12 A/us max.
WHAT THIS MEANS:
• Voltage regulator will boot up to 1.05V and then software should adjust VR to VID to comply with power specification.
• Voltage regulator is assumed to hold the DC set point, as measured at SENSE_VDD pins, to very small error (VID ±10 mV?)
• Switching voltage regulator ripple is suppressed to within a very small range (VID ±20 mV?)
• Load step transients are suppressed by capacitance to VID +50 mV and VID -30 mV. Overshoot is judged to be harder to suppress than undershoot. Overshoot is also less of a concern to the processor.
• Load step varies with program activity on the processor. Worst case on T4240 is 20A for 23 virtual cores alternating between PH10/PH20 power saving state and L1-resident, intensive computation with AltiVec.

How to check for spec compliance?
• Check the VRMS value between SENSEVDD and SENSEGND with a True-RMS DMM.
• Check ripple and load step transients between SENSEVDD and SENSEGND with a differential probe and the oscilloscope set to a 20 MHz bandwidth, offset and zoomed in to a 20 mV/div range…
• …while running your worst case application software. (From suggestions by VRM suppliers.)
• Power-up current-step transients should not be a problem because the cores are released from boot hold-off one at a time, so we don't have to measure there.
• Power state changes after boot-up can be programmatically controlled, so it should be possible to reduce Δt if necessary. (Input from IC designers.)

Voltage Observations

Load Board pattern looping - SENSEVDD - avg
dhrystone power pattern from vector 3 to end of pattern – 25C – 1800 MHz. Sync at vector 369.
Average of 16 captures shows SENSEVDD AC: +21 mV / -28 mV
[Figure: SENSEVDD versus time across the pattern phases (system por, plat config & dma, dhrystone, complete) at 1.15 ms, 1.4 ms, 2.15 ms, and 2.43 ms, with current levels of 9 A, 11 A, and 18 A. Annotations: -70 mV undershoot, <10 mV ripple, ~28 mV undershoot, ~28 mV overshoot.]
SENSEVDD remains constant despite increased current demand but spikes at steps

Load Board pattern looping - VDD – avg DC
dhrystone power pattern from vector 3 to end of pattern – 25C – 1800 MHz. Sync at vector 369.
Average DC shows VDD: 1.023 V +36 mV / -29 mV
[Figure: VDD versus time with current levels of 9 A, 11 A, and 18 A.]
VDD adjusts upward to compensate for increased current demand

ΔV on the T4240RDS w/24 cores running Dhrystone on Linux
1 Sample, 200MHz filter
This could be caused by the die, the board, the electric lights on the bench, or the atmosphere. Not sure which. Probably not the power supply.
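The target-impedance rule of thumb from the Power Distribution System Design slide can be sketched numerically. This is a minimal illustration, not a design tool: the function name and the `step_fraction` parameter are hypothetical, and the inputs are the 67 W, 1.0 V, and 30 mV figures quoted for the T4240.

```python
def target_impedance_ohms(max_power_w, nominal_v, allowed_dv_v, step_fraction=0.5):
    """Estimate PDN target impedance as allowed voltage variance over assumed current step.

    The current step is assumed to be step_fraction of (max power / nominal voltage),
    following the rule of thumb used when no better di/dt data is available.
    """
    delta_i_a = step_fraction * max_power_w / nominal_v
    return allowed_dv_v / delta_i_a


if __name__ == "__main__":
    z = target_impedance_ohms(max_power_w=67.0, nominal_v=1.0, allowed_dv_v=0.030)
    print(f"Assumed current step: {0.5 * 67.0 / 1.0:.1f} A")
    print(f"Target impedance: {z * 1000.0:.2f} mOhm")
```

With the unrounded 33.5 A step this gives about 0.90 mΩ; rounding the step up to 34 A, as the slide does, gives the quoted 0.88 mΩ.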
ΔV on the T4240RDS w/24 cores running Dhrystone on Linux
1 Sample, 20MHz filter
[Figure: oscilloscope capture, 5 ms. Event occurring every 4 ms.]

T4240RDS w/24 cores running Dhrystone on Linux
1 Sample, 20MHz filter, triggered by "the event"
[Figure: oscilloscope capture, 10 µs. 23 mV overshoot, 18 mV undershoot; occurs every 4 ms.]
Believe this is caused by a current step on the die. But hard to tell in Linux so will develop our own controlled test case.

Creating a Current Step

Core + Platform Current from data sheet for e6500 SOCs

e6500 core-based parts | T4240 r2 | T4160 r2 | T2080* | T2081*
Maximum 1867/800/1867/66 @ 105C | 63A | 53A | ~27.3A | ~26.6A
Thermal 1867/800/1867/66 @ 105C | 54A | 46A | ~25.2A | ~24.2A
Typical 1867/800/1867/66 @ 65C | 37A | 31A | ~14.1A | ~13.3A
Maximum 1667/733/1867/66 @ 105C | 61A | 50A | |
Thermal 1667/733/1867/66 @ 105C | 52A | 44A | |
Typical 1667/733/1867/66 @ 65C | 34A | 28A | |
Maximum 1500/667/1600/66 @ 105C | 50A | 40A | ~21.2A | ~20.5A
Thermal 1500/667/1600/66 @ 105C | 42A | 35A | ~19.4A | ~18.7A
Typical 1500/667/1600/66 @ 65C | 30A | 25A | ~12.3A | ~11.6A
Maximum 1200/533/1600/66 @ 65C: 16.7A

Typical power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA on the platform with 100% activity factor.
Thermal power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA on the platform at 100% activity factor.
Maximum power assumes Dhrystone running with activity factor at 100% (on all cores) and executing DMA on the platform at 115% activity factor.
*1800/700/2133/66; 1533/600/1867/66; 1200/533/1600/66

Core + Platform Current from data sheet for e5500 SOCs

e5500 core-based parts | P5020 | P5010 | P5040* | P5021 | T1040**
Maximum 2000/800/1333/66 @ 105C | 27.3A | 22.7A | 40.0A | 28.2A |
Thermal 2000/800/1333/66 @ 105C | 25.4A | 21.8A | 38.2A | 27.3A |
Typical 2000/800/1333/66 @ 65C | 14.5A | 12.7A | 26.4A | 19.1A |
Maximum 1800/700/1300/66 @ 105C | 25.4A | 20.9A | 38.2A | 27.3A |
Thermal 1800/700/1300/66 @ 105C | 23.6A | 20.0A | 37.3A | 26.4A |
Typical 1800/700/1300/66 @ 65C | 12.7A | 10.9A | 24.6A | 18.2A |
Maximum 1600/600/1200/66 @ 105C | 20.9A | 17.3A | | | ~6.4A
Thermal 1600/600/1200/66 @ 105C | 20.0A | 17.3A | | | ~6.0A
Typical 1600/600/1200/66 @ 65C | 11.8A | 10.9A | | | ~4.2A
Maximum 1200/600/1200/66 @ 65C | 18.0A | 15.0A | | | 5.8A

Typical power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA on the platform with 100% activity factor.
Thermal power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA on the platform at 100% activity factor.
Maximum power assumes Dhrystone running with activity factor at 100% (on all cores) and executing DMA on the platform at 115% activity factor.
* 2000/700/1333/66; 1800/600/1200/66
** 1400/600/1600/66; 1200/500/1600/66

ΔI on the T4240 load board at 25C ambient
dhrystone power pattern from vector 3 to end of pattern – 25C – 1800 MHz.
[Figure: current versus time across the pattern phases (system por, core boot, plat config and dma running, Dhrystone, pattern stopped) at 1.15 ms, 1.4 ms, and 2.15 ms, with current levels of 9 A, 11 A, and 18 A.]

What is the current demand of the die wrt time?
• Static timing requires paths to finish inside 1 cycle. (most paths)
• For e5500 on P5020, the core was timed to 460 ps – very small dt!
• More likely current can't change dramatically in less than 4–6 core clocks and that would be rare worst case.
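The timing figures above are easier to compare once cycle counts and clock periods are put on one scale. The sketch below does that conversion; the helper names are illustrative, and the 1.67 GHz example frequency is the T4240 core frequency used for the current-step measurements in this deck.

```python
def period_ps(freq_hz):
    """Clock period in picoseconds for a given frequency in Hz."""
    return 1e12 / freq_hz


def freq_ghz_from_period(period_in_ps):
    """Clock frequency in GHz for a given period in picoseconds."""
    return 1e3 / period_in_ps


def clocks_to_ns(n_clocks, freq_hz):
    """Duration in nanoseconds of n_clocks cycles at freq_hz."""
    return n_clocks / freq_hz * 1e9


if __name__ == "__main__":
    print(f"460 ps timing corresponds to {freq_ghz_from_period(460.0):.2f} GHz")
    print(f"2.0 GHz core clock period: {period_ps(2.0e9):.0f} ps")
    print(f"4 to 6 clocks at 2.0 GHz: {clocks_to_ns(4, 2.0e9):.1f} to {clocks_to_ns(6, 2.0e9):.1f} ns")
    print(f"5 clocks at 1.67 GHz: {clocks_to_ns(5, 1.67e9):.2f} ns")
```

In other words, a "4–6 core clocks" floor on how quickly the current can change is a window of a few nanoseconds at these frequencies, far shorter than anything a board-level regulator can follow.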
[Figure: % of paths still toggling after clock edge at t = 0 (blue), from 0 to 1, versus time from 0 to 500.]

Worst Case AC Current Stimulus Goal
• Programmatically cause the actual die to represent a variable load at controlled frequencies.
• Observing 23 cores to change from wait to intensive compute within 5 core clocks of one another (3 ns at 1.67 GHz)
• Max frequency = platform clk/16
• GPIO4[3] signal for o'scope sync
• CONFIRMED 23 THREADS IN PH10 DURING MINIMAL POWER (using TWAITSR0)!
[Figure: current and voltage versus time. Current alternates between High (power intensive instructions) and Low (minimal power wait instruction) on each decrementer interrupt, with IPI vect toggling between 1 and 0.]

Wait for Interrupt Instruction
• wait stops synchronous processor activity…until an asynchronous interrupt…occurs. The processor may use this to reduce power consumption.
• When an interrupt occurs while the processor is waiting, its associated save/restore register 0 will point to the instruction following the wait.
• Core frequency stays constant.
• Power switches from HI to LO and back on decrementer interrupt.
• Hypothesis: current is constant for HI and LO at all decrementer frequencies.
[Figure: current versus core frequency f (1333, 1500, 1600). HI line Cmax·V·f up to Imax, LO line Cwait·V·f, both above Istatic.]

Power Management Fundamentals
• CMOS Energy Consumption
  − Dynamic Energy Consumption
  − Static Energy Consumption

Fast Current Step for T4240
• Inter-processor interrupt causes all 23 cores to switch from wait to power intensive within 5 core clocks (3 ns).
• On-die current slew ~6000 A/us
[Figure: Normal distribution of time per core for change of state from wait to run. Percentage of samples (0% to 35%) versus time in nanoseconds (0 to 200). Std Dev: 0.77 ns, Median: 139.8 ns, Slope: ~6000 A/us.]

Slower Current Step for T4240
• Inter-processor interrupt sent sequentially to each of 23 cores with an intervening delay (3 instructions) caused a switch from wait to power intensive within ~500 core clocks (300 ns).
• On-die current slew ~60 A/us
[Figure: Normal distribution of time taken per core for change of state from wait to run. Percentage of samples (0 to 35) versus time in nanoseconds (0 to 400). Std Dev: 118.8 ns, Median: 310.9 ns, Slope: ~60 A/us.]

What is the Correct Use Case for Current Step?

What use case for current step to max power?
• DHRY: Dhrystone (entirely integer code)
• FXSC6/12/15: Scalar fixed-point radix-two, in-place DFT 2^n points* (all integer)
• FPSC6/12/15: Scalar floating-point radix-two, in-place DFT 2^n points (add SPFP)
• FXAV6/12/15: Vector fixed-point radix-two, in-place 2^n points DFT (SIMD 8 shorts)
• FPAV6/12/15: Vector floating-point radix-two, in-place DFT (SIMD 4 SPFP)
• Core 0 to continuously control and report current from I2C
• Combinations of thread 1 through thread 23 running separate copies (AMP) of above benchmarks.
  − 3 clusters, 12 cores, 23 threads for T4240
  − 2 clusters, 8 cores, 15 threads for T4160
  − 1 cluster, 4 cores, 7 threads for T2080-like part
• PCL10 cluster power-saving state for inactive clusters.
* Where n = 6/12/15

Performance Metrics for Selection of Use Cases

BenchMark | IPC | CLKs | FP/i% | AV/i% | IL1M/i% | DL1M/i% | L2Hits
DHRY | 0.62 | 1492 | 0 | 0 | 0 | 0.2% | 40
FXSC6 (N=64) | 0.18 | 7.37M | 0 | 0 | 2.0% | 0.0% | 54.2K
FPSC6 (N=64) | 0.18 | 7.66M | 0.1% | 0 | 2.0% | 0.0% | 56.4K
FXAV6 (N=64) | 0.18 | 7.34M | 0.0% | 0.04% | 2.0% | 0.0% | 53.8K
FPAV6 (N=64) | 0.18 | 7.64M | 0 | 0.04% | 2.0% | 0.0% | 56.2K
FXSC12 (N=4K) | 0.53 | 12.74M | 0 | 0 | 0.4% | 0.0% | 169.7K
FPSC12 (N=4K) | 0.38 | 11.55M | 5.3% | 0 | 0.7% | 0.0% | 150.3K
FXAV12 (N=4K) | 0.26 | 8.86M | 0.0% | 3.2% | 1.2% | 0.0% | 97.0K
FPAV12 (N=4K) | 0.30 | 9.76M | 0 | 3.0% | 1.1% | 0.0% | 125.0K
FXSC15 (N=32K) | 0.92 | 61.46M | 0 | 0 | 0.2% | 0.0% | 1301.7K
FPSC15 (N=32K) | 0.67 | 45.93M | 7.7% | 0 | 0.8% | 0.0% | 1096.7K
FXAV15 (N=32K) | 0.52 | 20.74M | 0 | 7.0% | 0.6% | 0.0% | 504.0K
FPAV15 (N=32K) | 0.60 | 27.32M | 0 | 5.5% | 1.2% | 0.0% | 841.4K

T4240 Current Step Observations

T4240 r1 Current Measurement – Dhrystone on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone: 46A to 59A step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 12 cores/24 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]
Consult the HW spec for actual max power numbers!

T4240 r2 Current Measurement – Dhrystone on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone: 34A to 48A step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 12 cores/24 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]
Consult the HW spec for actual max power numbers!
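The on-die slew rates quoted for the fast and slower current steps follow from dividing a current step by the time over which it occurs. The sketch below assumes an 18 A step, a value chosen here because it reproduces the ~6000 A/us and ~60 A/us figures for 3 ns and 300 ns, and then applies the same arithmetic to the 34 A to 48 A Dhrystone step above. The function name is hypothetical.

```python
def slew_rate_a_per_us(delta_i_a, delta_t_ns):
    """Average current slew rate in A/us for a step of delta_i_a amps over delta_t_ns nanoseconds."""
    return delta_i_a * 1000.0 / delta_t_ns


if __name__ == "__main__":
    assumed_step_a = 18.0  # assumption: step size consistent with the quoted slew rates
    print(f"Fast step (3 ns):     {slew_rate_a_per_us(assumed_step_a, 3.0):.0f} A/us")
    print(f"Slower step (300 ns): {slew_rate_a_per_us(assumed_step_a, 300.0):.0f} A/us")
    # Rev. 2 Dhrystone measurement: 34 A to 48 A in ~3 ns
    print(f"Dhrystone r2 step:    {slew_rate_a_per_us(48.0 - 34.0, 3.0):.0f} A/us")
```

All of these are on-die figures; they are orders of magnitude above the 12 A/us slew rate in the specification, which applies to what the bulk capacitors and regulator see.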
T4240 r1 Load Step – AltiVec on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• AltiVec FP FFT: 18A max step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 12 cores/24 threads, FFT 4096 pts AltiVec floating point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]
Changing the HW spec from 30A step to 20A max!

T4240 r1 Current Measurement – Dhrystone on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 45.5A to 53.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 8 cores/16 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

T4240 r2 Current Measurement – Dhrystone on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 34.5A to 43.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 8 cores/16 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

T4240 r1 Load Step – AltiVec on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• With AltiVec: 11A max step in ~3ns at 1.05V ~105C.
• Max undershoot and overshoot <15mV
[Figure: Current change from wait to full power on IPI interrupt, 8 cores/16 threads, FFT 4096 pts AltiVec floating point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

T4240 r1 Current Measurement – Dhrystone on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 45.5A to 49.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 4 cores/8 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

T4240 r2 Current Measurement – Dhrystone on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 35A to 39.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 4 cores/8 threads, Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

T4240 r1 Load Step – AltiVec on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• With AltiVec: 5A max step in ~3ns at 1.05V ~105C.
• Max undershoot and overshoot <15mV
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 4 cores/8 threads, FFT 4096 pts AltiVec floating point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds, 0 to 700). Frequency: 0.1 Hz.]

Measured Step on T4240 RDB
Observed current step for combined cores and platform at ~100C, 1.66GHz, 1.05V

Benchmark | T4240 (24 cores) | Estimate T4160 (16 cores) | Estimate T2080 (8 cores)
Dhrystone | 14.5 A | 9.0 A | 4.0 A
Fixed-point DFT | 18.0 A | 11.0 A | 5.5 A
Floating-point DFT | 18.0 A | 12.0 A | 5.5 A
Vector Fixed-point DFT | 18.0 A | 12.0 A | 5.5 A
Vector Floating-point DFT | 18.0 A | 11.5 A | 5.5 A

Dynamic current step is nearly constant over temperature and core frequency.

T1040 Current Step Observations

T1040 Current Measurement – Dhrystone on 4 cores
• T1040 with International Rectifier 3565A VR.
• Dhrystone: 3.4A to 4.45A step in ~3ns at 1.0V ~Room temp.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T1040 from wait to full power on IPI interrupt, 4 cores, Dhrystone. Current (amperes, 3 to 5) and temperature (C; Diode1, 25 to 45) versus time (seconds, 0 to 450). Frequency: 0.1 Hz.]

T1040 Current Measurement – Dhrystone on 4 cores
• T1040 with International Rectifier IR36021 and IR3550.
• Dhrystone: 3.75A to 4.85A step in ~3ns at 1.0V ~85C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T1040 from wait to full power on IPI interrupt, 4 cores, Dhrystone. Current (amperes, 3.2 to 5.2) and temperature (C; Diode1, 20 to 100) versus time (seconds, 0 to 100). Temperature control via heat gun!]

Discussion of current slew rate

What does the on-die current step say about di/dt externally?
• On-die capacitance and package inductance reduce di/dt at VDD pins.
• Recommended decoupling caps (0.1 uF) on every power pin further reduce it to what the bulk decoupling capacitors have to deal with (spec'd 12 A/us).
• From AN2747: di/dt is a parameter of the silicon die that is essentially hidden by the capacitive and inductive components of the die substrate, the die-local bypass capacitors, the socket (if any) and other parasitics. Consequently, the di/dt parameter used to design the power system is not the di/dt of the processor die … but the filtered di/dt of the combined processor, substrate-resident capacitors and the substrate itself. This di/dt is much slower, as the current demands are initially supplied by the adjacent transistors, die power traces, die substrate and local capacitors.

Explaining the reduction of di/dt vs decoupling caps (hypothetical example)
[Figure: three current waveforms with di/dt of 3500 A/us, 1350 A/us, and 15 A/us.]

di/dt from the tester 110C
dhrystone power pattern from vector 369 (system ready) to vector 6000 (platform configured and dma running) - biggest current bump, 110C, 1800 MHz
[Figure: current versus time showing an 18 A step with a slope of 1.4 A/μs.]

Is delta Voltage within spec?
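A minimal sketch of that check, assuming the transient limits quoted for the T4240 (overshoot below VID + 50 mV, undershoot within VID - 30 mV). The function name, its parameters, and the example numbers passed to it are illustrative only; real compliance depends on measurements taken at the sense pins while running worst-case software.

```python
def check_transients(overshoot_mv, undershoot_mv,
                     max_overshoot_mv=50.0, max_undershoot_mv=30.0):
    """Compare measured transient excursions (as positive mV magnitudes) with the limits.

    Returns (passes, overshoot_margin_mv, undershoot_margin_mv).
    Overshoot must stay strictly below its limit; undershoot must not exceed its limit.
    """
    overshoot_margin = max_overshoot_mv - overshoot_mv
    undershoot_margin = max_undershoot_mv - undershoot_mv
    passes = overshoot_margin > 0 and undershoot_margin >= 0
    return passes, overshoot_margin, undershoot_margin


if __name__ == "__main__":
    # Example magnitudes: 23 mV overshoot and 18 mV undershoot
    ok, over_margin, under_margin = check_transients(23.0, 18.0)
    print(f"within spec: {ok}, overshoot margin {over_margin:.0f} mV, "
          f"undershoot margin {under_margin:.0f} mV")
    # A 70 mV undershoot would violate the -30 mV limit
    print(f"within spec: {check_transients(10.0, 70.0)[0]}")
```

The defaults encode the asymmetric +50 mV / -30 mV band, so the same helper can be reused with other limits by passing different maximums.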
Transient Undershoot and Overshoot on T4240RDS with 18A load step (shown relative to earlier slide)
[Figure: measured VDD and IOUT (20 A/div) versus time overlaid on the specification diagram: VID or DC set point, tolerance band VID +50 mV / -30 mV, switching ripple, overshoot, undershoot (principal silicon concern), step-up, step-down, load-step.]

Load Step with 12 cores for IR3565 on T4240RDB
W/AltiVec – 20A Step – ~100C – 1.05V
[Figure: oscilloscope capture. <10 mV ripple, 18 mV undershoot.]

Load Release with 12 cores for IR3565 on T4240RDB
W/AltiVec – 20A Step – ~100C (TBC) – 1.05V
[Figure: oscilloscope capture. <10 mV ripple, 23 mV overshoot.]

Load Step with 4 cores for IR3565 on T4240RDB
W/AltiVec – 6A Step – ~100C (temp to be confirmed)
[Figure: oscilloscope capture. <10 mV ripple, 12 mV undershoot.]

Load Release with 4 cores for IR3565 on T4240RDB
W/AltiVec – 6A Step – ~100C (temp to be confirmed)
[Figure: oscilloscope capture. <10 mV ripple, 10 mV overshoot.]

Conclusion
• We have load step current change data for 12 cores, 8 cores, and 4 cores for what we think is a worst case use case with and without AltiVec.
• We have di/dt measurements but they are taken with our decoupling caps included. As a result they are significantly lower than the value obtained from the current step changing in the measured time on die. In other words di/dt is reduced by on-die capacitance, package parasitics, and onboard decoupling.
• We recommend designing to our spec, i.e.
  − … place at least one decoupling capacitor at each VDD, OVDD, DVDD, GnVDD, and LVDD pin of the device. These capacitors should have a value of 0.1 μF. Only ceramic SMT (surface mount technology) capacitors should be used to minimize lead inductance, preferably 0402 or 0603 sizes.
  − As a guideline for customers and their power regulator vendors, Freescale recommends that these bulk capacitors be chosen to maintain the positive transient power surges to less than VID+50 mV (negative transient undershoot should comply with specification of VID-30mV) for current steps of up to 20A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew rate of 12 A/us.

Conclusion
• DC Voltage Specification communicates how VRM must respond to changes in load current demand. High-end VRMs can easily meet ±1% up to ~100 kHz.
• AC Voltage Specification communicates how PDS must damp higher frequency (100 kHz to 100 MHz?) dv/dt events caused by di/dt through inductive parasitics.
• dv/dt on a customer's system is a function of Z and di/dt from T4240 and other sources.
• We are measuring ΔI on real silicon for several different use cases.
• It is practical to achieve ΔV < 30 mV.

References
1. "Extended Adaptive Voltage Positioning (EAVP)", Alex Waizman and Chee-Yee Chung, pp 65-68, 2000
2. "CPU Power Supply Impedance Profile Measurement Using FFT and Clock Gating", Alex Waizman, pp 29-32, 2003
3. "Resonant Free Power Network Design Using Extended Adaptive Voltage Positioning (EAVP) Methodology", Alex Waizman and Chee-Yee Chung, IEEE Transactions on Advanced Packaging, Vol. 24, No. 3, August 2001
4. "A Resonance-Free Power Delivery System Design Methodology Applying 3D Optimized Extended Adaptive Voltage Positioning", Tao Xu and Brad Brim, pp 107-110, 2008
5. "Integrated Power Supply Frequency Domain Impedance Meter (IFDIM)", Alex Waizman, pp 217-220, 2004
6. "Power Delivery Network (PDN) Tool User Guide", Altera, March 2009
7. High-Speed Digital Design: A Handbook of Black Magic, Howard Johnson and Martin Graham, Prentice-Hall, 1993
8. Frequency-Domain Characterization of Power Distribution Networks, Istvan Novak and Jason R. Miller, Artech House, 2007
9.
"Power Supply Design for PowerPC™ Processors", Gary Milliorn, Freescale AN2747, Rev. 1.1, 09/2004
10. "Power Supply Network Design for 3% Voltage Margin", FTFENT-F0038, June 2012

Introducing The QorIQ LS2 Family
Breakthrough, software-defined approach to advance the world's new virtualized networks
• New, high-performance architecture built with ease-of-use in mind: Groundbreaking, flexible architecture that abstracts hardware complexity and enables customers to focus their resources on innovation at the application level
• Optimized for software-defined networking applications: Balanced integration of CPU performance with network I/O and C-programmable datapath acceleration that is right-sized (power/performance/cost) to deliver advanced SoC technology for the SDN era
• Extending the industry's broadest portfolio of 64-bit multicore SoCs: Built on the ARM® Cortex®-A57 architecture with integrated L2 switch enabling interconnect and peripherals to provide a complete system-on-chip solution

QorIQ LS2 Family Key Features
Unprecedented performance and ease of use for smarter, more capable networks: SDN/NFV Switching, Data Center, Wireless Access
High performance cores with leading interconnect and memory bandwidth
• 8x ARM Cortex-A57 cores, 2.0GHz, 4MB L2 cache, w/ Neon SIMD
• 1MB L3 platform cache w/ECC
• 2x 64b DDR4 up to 2.4GT/s
A high performance datapath designed with software developers in mind
• New datapath hardware and abstracted acceleration that is called via standard Linux objects
• 40 Gbps packet processing performance with 20 Gbps acceleration (crypto, Pattern Match/RegEx, Data Compression)
• Management complex provides all init/setup/teardown tasks
Leading network I/O integration
• 8x1/10GbE + 8x1G, MACSec on up to 4x 1/10GbE
• Integrated L2 switching capability for cost savings
• 4 PCIe Gen3 controllers, 1 with SR-IOV support
• 2 x SATA 3.0, 2 x USB 3.0 with PHY

See the LS2 Family First in the Tech Lab!
4 new demos built on QorIQ LS2 processors:
• Performance Analysis Made Easy
• Leave the Packet Processing To Us
• Combining Ease of Use with Performance
• Tools for Every Step of Your Design

www.Freescale.com
© 2014 Freescale Semiconductor, Inc. | External Use