Freescale Power Supply Requirements

Design to Tight
Power Supply Requirements
FTF-NET-F0036
Chuck Corley | DMTS
Mohit Kedia | Engineering Rotation Program
APR.2014
TM
External Use
Abstract: Design to Tight Power Supply Requirements
• Session Length: 2 hours
• Freescale has begun specifying core supply voltages with ±30 mV
tolerances. Customers are accustomed to ±5% and are asking
questions about how to achieve this tighter requirement. This
presentation will discuss the specification and what customers need
to know for successful designs.
Agenda
• Defining the problem
− 3% DC voltage requirement
− Time versus frequency domain
• VDD/PLAT Voltage Specification for T (28nm) series parts
• Current step observations for T4240RDS
• Current step observations for T1040QDS
• Discussion of current slew rate
Defining the Problem
Requirements
• Power Supply must supply a stable voltage reference
• Power Supply must distribute adequate current
Observations:
• Switching power supplies actually supply a digitally varying voltage (~500
kHz)
• Microprocessor’s current demand may vary as fast as core frequency
(~2 GHz)
• Power Distribution Network (PDN) has resistance, capacitance,
inductance, mutual capacitance, and mutual inductance through PCB,
socket, vias, and capacitors.
• Changes in current at a particular frequency cause voltage changes at
that frequency across these impedances.
Problem:
• Silicon vendors are tightening the voltage specifications while the
current continues to increase.
SOCs incorporating the e6500 core in 28nm
e6500 core-based parts | T4240 | T4160 | B4860 | T2080 | T2081 | B4420
e6500 cores/threads | 12/24 | 8/16 | 4/8 | 4/8 | 4/8 | 2/4
Max core frequency (Hz) | 1.66G | 1.8G | 1.66G | 1.8G | 1.8G | 1.6G
Clusters/L2 per cluster | 3/2MB | 2/2MB | 1/2MB | 1/2MB | 1/2MB | 1/2MB
DDR3/3L memory controllers | 3 | 2 | 2 | 1 | 1 | 1
CPC (L3) cache per controller | 512KB | 512KB | 512KB | 512KB | 512KB | 512KB
DMA controllers/channels | 2/8 | 2/8 | 2/8 | 3/8 | 3/8 | 1/8
StarCore SC3900 FVP core subsystems | NA | NA | 6 | NA | NA | 2
StarCore clusters/L2 per cluster | NA | NA | 3/2MB | NA | NA | 1/2MB
Package | 1932 FCPBGA, 45 mm × 45 mm, 1 mm pitch | 1932 FCPBGA, 45 mm × 45 mm, 1 mm pitch | 1020 FCPBGA, 33 mm × 33 mm, 1 mm pitch | 896 FCPBGA, 25 mm × 25 mm, 0.8 mm pitch | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm pitch | 1020 FCPBGA, 33 mm × 33 mm, 1 mm pitch
Tight Core Voltage Specifications for 28nm
e6500 core-based parts | T4240 | T4160 | B4860 | T2080/81 | B4420
Core and platform supply voltage, startup | 1.05 V ± 30 mV | 1.05 V ± 30 mV | 1.05 V ± 30 mV | 1.025 V ± 30 mV | 1.05 V ± 30 mV
Core and platform supply voltage, normal operation | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV | VID ± 30 mV
Operation at 1.1V is allowable for up to 25ms at initial power on. | footnote 6 | footnote 6 | footnote 6 | footnote 3 | footnote 5
Voltage ID (VID) operating range is between 0.95V to 1.05V. Regulator selection should be based on Vout range of at least 0.9V to 1.1V, with resolution of 12.5mV or better. | 0.9V but changing to 0.95 | 0.9V but changing to 0.95 | footnote 1 | footnote 7; 0.975 to 1.025 | 0.9V
…maintain the transient power surges to less than +50 mV (negative transient undershoot should comply with specification of VID-30mV) for current steps of up to 20 A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew rate of 12 A/us. | Section 4.2.2 | Section 4.2.2 | S3.2.2: ± 30 mV; no step spec’d | Section 4.2.2; 10A step | Footnote 4; S3.2.2: +50/-30 mV 1200MHz; +100mV transient; 20A step
It is recommended that the system designer place at least one (0.1μF) decoupling capacitor at each VDD, VDDC, CVDD, OnVDD, DVDD, EVDD, GnVDD, and LnVDD pin of the device. | Section 4.3 | Section 4.3 | Section 3.3 | Section 4.3 | Section 3.3
Spec Rev | Rev G | Rev D | Rev H | Rev E/D | Rev C
SOCs incorporating the e5500 core; some 28nm
e5500 core-based parts | P5020/10 | P5040/21 | T1040/42 | T1020/22
e5500 cores | 2/1 | 4/2 | 4 | 2
Max core frequency (Hz) | 2.0G | 2.2G | 1.4G | 1.4G
L2 cache per core | 256KB | 512KB | 256KB | 256KB
Memory controllers | 2 | 2 | 1 | 1
CPC (L3) cache per controller | 1MB | 1MB | 256KB | 256KB
DMA controllers/channels | 2/4 | 2/4 | 2/8 | 2/8
Package | 1295 FCPBGA, 37.5 mm × 37.5 mm, 1 mm | 1295 FCPBGA, 37.5 mm × 37.5 mm, 1 mm | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm | 780 FCPBGA, 23 mm × 23 mm, 0.8 mm
Technology | 45nm | 45nm | 28nm | 28nm
Tight Core Voltage Specifications for e5500 & 28nm
e5500 core-based parts | P5020/10 | P5040/21 | T1040/42/20/22
Core and platform supply voltage, startup | 1.0 V ± 50 mV (core frequency = 1200 MHz); 1.1 V ± 50 mV (core frequency > 1200 MHz) | 1.1 V ± 50 mV (core frequency ≤ 2000 MHz); 1.2 V ± 30 mV (core frequency > 2000 MHz) | 1.025 V ± 30 mV
Core and platform supply voltage, normal operation | | | VID ± 30 mV
Operation at 1.1V is allowable for up to 25ms at initial power on. | NA | NA | footnote 5
Voltage ID (VID) operating range is between 0.975V to 1.025V. Regulator selection should be based on Vout range of at least 0.9V to 1.1V, with resolution of 12.5mV or better. | NA | NA | footnote 7
…maintain the transient power surges to less than +50 mV (negative transient undershoot should comply with specification of VID-30mV) for current steps of up to 20 A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew rate of 12 A/us. | NA | NA | Section 4.2.2; 10A step
…at least one (0.1μF) decoupling capacitor at each VDD, VDDC, CVDD, OnVDD, DVDD, EVDD, GnVDD, and LnVDD pin of the device. | Section 3.4; 0.01 or 0.1μF* | Section 4.3; 0.01 or 0.1μF* | Section 4.3
Spec Rev | Rev 0 | Rev 0 | Rev E
* Better to use largest capacitance that will fit on footprint under the part.
What is Voltage ID (VID) for 28nm Products?
• A specific method of selecting the optimum voltage level to
guarantee performance and power targets.
− QorIQ device contains fuse block registers defining required voltage level. This EFUSE
definition is accessed through the Fuse Status Register (DCFG_FUSESR).
− Customer system must use the VID to change the voltage regulators in the system in a
reliable and safe manner.
• QorIQ Chassis Architecture Specification, Generation 2 Revision 0.9
defines the general EFUSE definition.
− A set of 24 efuses ([0-23]) that determine the speed bin and voltage requirements for the
device domains.
− The range and steps are much more flexible than actually needed by manufacturing; only
the fuses necessary to provide the required voltages will be implemented.
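As a rough illustration of the regulator side of VID, the sketch below snaps an already decoded VID voltage onto a regulator output grid. The 0.9 V to 1.1 V range and 12.5 mV resolution come from the regulator-selection guidance in the specification tables; the function name and the choice of grid origin are assumptions, and the decoding of DCFG_FUSESR itself is not shown because the slides do not give the fuse encoding.

```python
def nearest_regulator_setting(vid_v, v_min=0.9, v_max=1.1, step_v=0.0125):
    """Snap a requested VID voltage to the regulator's output grid.

    Defaults follow the slide guidance: Vout range of at least 0.9 V to
    1.1 V with 12.5 mV resolution. The grid origin at v_min is an assumption.
    """
    if not v_min <= vid_v <= v_max:
        raise ValueError(f"VID {vid_v} V is outside the regulator range")
    steps = round((vid_v - v_min) / step_v)
    return round(v_min + steps * step_v, 4)


print(nearest_regulator_setting(0.975))  # a value already on the grid
print(nearest_regulator_setting(1.03))   # a value between two grid points
```

A request that falls between two steps lands on the nearest one, so the worst-case set-point error from quantization alone is half a step (6.25 mV at 12.5 mV resolution), which is worth keeping in mind against a ±30 mV total budget.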
Voltage Specification Terms Better Defined
[Figure: vDD and IDD versus time around a load step. Labels: VID or DC set point, tolerance VID +50mV / -30mV, switching ripple, overshoot, undershoot (principal silicon concern), step-up, step-down, load-step.]
Power Distribution System Theory – VRMs
• Voltage Regulator Modules (VRMs) use feedback to hold a constant
supply voltage (up to the frequency of the inherent low pass filter).
• QorIQ parts allow feedback from the die voltage plane – SENSEVDD
• T4240QDS Intersil VRM (typical of most VRMs) advertises ±0.5%
Closed-loop System Accuracy Over Load, Line and Temperature [for
transients < 1/3 (to 1/5) of switching frequency – 350-500kHz].
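To put the percentage figures in this deck next to the ±30 mV core specification, the conversion can be sketched as below. This is a minimal illustration rather than anything from the original slides; the function name `tolerance_mv` and the 1.0 V nominal rail are assumptions chosen for round numbers.

```python
def tolerance_mv(percent, nominal_v):
    """Convert a +/- percentage tolerance on a nominal voltage to +/- millivolts."""
    return percent / 100.0 * nominal_v * 1000.0


# Percentages quoted in the slides, evaluated at an assumed 1.0 V nominal rail.
for label, pct in [("VRM closed-loop accuracy", 0.5),
                   ("3% core voltage requirement", 3.0),
                   ("traditional 5% tolerance", 5.0)]:
    print(f"{label}: +/-{tolerance_mv(pct, 1.0):.1f} mV")
```

Under that assumption the advertised VRM accuracy uses only a small part of the ±30 mV window, leaving the remainder for ripple and load-step transients.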
[Figure: VRM feedback loop. Labels: Vref, error amplifier (+/-), LPF, bulk caps, planes, bypass caps. From Intel VRM 11.1.]
Model PDN System
[Figure: PDN model from the +12V input through the VRM to the die. Labels: ST, SB, LF, ESR, Vref, bulk caps (multiple 22 to 1000uF caps), bypass caps (0.1uF per pin), VDD planes, PKG, DIE, SENSEVDD_P and SENSEVDD_N feedback, VID ± 30 mV.]
Power Distribution System Theory - Ripple
• The most common meaning of ripple in electrical science is the small
unwanted residual periodic variation of the direct current (dc) output of a
power supply which has been derived from an alternating current (ac) source.
This ripple is due to incomplete suppression of the alternating waveform within
the power supply.
[Figure: voltage versus time showing VRIPPLE (peak-to-peak) at the bulk capacitors, with PWM current spikes from the +12V supply when ST conducts.]
Power Distribution System Theory – AC Impedance
• Inductance in the traces and vias (and socket pogo pins) creates an
AC impedance (ZS) that causes dv/dt changes at the load with
varying di/dt.
• These dv/dt changes would “ride” on any DC voltage droop.
• Decoupling capacitors and capacitive plane layers are added to
reduce the AC impedance between VDD and GND.
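The effect of inductance on the supply can be sketched with v = L · di/dt. The example below is only an illustration: the 100 pH loop inductance is a hypothetical value picked for round numbers, while the two slew rates are the 12 A/us specification figure and the ~6000 A/us on-die figure quoted elsewhere in this deck.

```python
def inductive_drop_v(inductance_h, di_dt_a_per_us):
    """Voltage across an inductance for a given current slew (v = L * di/dt)."""
    return inductance_h * di_dt_a_per_us * 1e6  # convert A/us to A/s


L_LOOP_H = 100e-12  # hypothetical loop inductance, for illustration only

for slew in (12.0, 6000.0):
    drop_mv = inductive_drop_v(L_LOOP_H, slew) * 1e3
    print(f"{slew:7.0f} A/us -> {drop_mv:.1f} mV across {L_LOOP_H * 1e12:.0f} pH")
```

With these assumed values the slower slew produces a drop well inside a 30 mV budget while the on-die slew would not, which is why the fast edges have to be served by capacitance close to the die rather than through the board inductance.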
[Figure: PDN schematic. Labels: Vref, LPF, source VS = 1.00 V, series impedance ZS, load current IL, load voltage VL, 30 ea 22uF bulk caps, 83 ea 0.1uF bypass caps, VDD planes, DIE, SENSEVDD_P and SENSEVDD_N feedback.]
Reactive Elements in the PDN cause dv/dt
• Well documented problem (see references slide)
• Silicon vendors are tightening the DC specifications at lower
supply voltages.
• Customers are demanding more information from silicon
vendors to aid in designing compliant power supplies (Power
Distribution Networks or PDNs).
The PDN Problem in the Frequency Domain
[Figure: Total impedance versus frequency (log scale). Impedance axis 1.0E-05 to 1.0E+02 ohms; frequency axis 1.E+04 to 1.E+09 Hz. Curves: Z_total, Z_Pkg, Z_Die. Regions: VRM, board level PDN design, on-chip/package cut-off. Target lines ΔV(f)/ΔI(f) = Ztarget for P5020 (50mV) and T4240 (3%).]
Power Distribution System Design
• A common rule-of-thumb (in absence of better di/dt data from the
vendor) is to assume that Δi is 50% of max power/nominal voltage
(50% of 67W/1.0V = 34A). Δv for the same calculation would be the
AC variance allowed (30 mV for the T4240).
• Z = Δv/Δi = 0.88 mΩ
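The rule-of-thumb above can be worked through in a few lines. This is a sketch only; the function name and the `step_fraction` parameter are assumptions introduced for illustration, and the inputs are the T4240 figures from the slide.

```python
def target_impedance(max_power_w, nominal_v, allowed_dv_v, step_fraction=0.5):
    """Return (delta_i in amperes, target impedance in ohms).

    delta_i is taken as step_fraction of max power / nominal voltage,
    following the rule-of-thumb on the slide.
    """
    delta_i = step_fraction * max_power_w / nominal_v
    return delta_i, allowed_dv_v / delta_i


delta_i, z_target = target_impedance(67.0, 1.0, 0.030)
print(f"delta_i = {delta_i:.1f} A, Z_target = {z_target * 1e3:.2f} mOhm")
```

Without intermediate rounding the step is 33.5 A and the target is about 0.90 mΩ; the slide rounds the step up to 34 A first, which gives the quoted 0.88 mΩ.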
[Figure: Z (Ω) versus frequency, log-log, 0.0001 to 1.0000 Ω over 1 Hz to 1 GHz, with the target impedance marked.]
Latest T4240 Voltage Specifications
Core and Platform Supply Voltage – VID (or 1.05V bootup) ± 30 mV
• Supply voltage measured at the voltage sense pins
• Combined DC and AC variance from nominal not to exceed ±30 mV except
for an overshoot of less than +50 mV during transients. Transient voltages
may result from current steps of up to 20A with slew rates of 12 A/us max.
WHAT THIS MEANS:
• Voltage regulator will boot up to 1.05V and then software should adjust VR to
VID to comply with power specification.
• Voltage regulator is assumed to hold the DC Set Point – as measured at
SENSE_VDD pins – to very small error (VID ±10 mV?)
• Switching voltage regulator ripple is suppressed to within a very small range
(VID ±20 mV?)
• Load step transients are suppressed by capacitance to within VID +50mV and VID -30mV. Overshoot is judged to be harder to suppress than undershoot.
Overshoot is also less of a concern to the processor.
• Load step varies with program activity on the processor. Worst case on
T4240 is 20A for 23 virtual cores alternating between PH10/PH20 power
saving state and L1-resident, intensive computation with AltiVec.
How to check for spec compliance?
• Check VRMS value between SENSEVDD and SENSEGND with a
True-RMS DMM.
• Check ripple and load step transients between SENSEVDD and
SENSEGND with a differential probe and the oscilloscope set for
20 MHz bandwidth, offset, and zoomed in to a 20 mV/div range…
• …while running your worst case application software.
(From suggestions by VRM suppliers.)
• Power-up current-step transients should not be a problem because
the cores are released from boot hold-off one at a time, so we
don’t have to measure there.
• Power state changes after boot-up can be programmatically
controlled – so it should be possible to reduce Δt if necessary.
(Input from IC designers.)
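Once a scope capture has been exported, the comparison against the limits can be automated. The sketch below assumes the samples are SENSEVDD readings in volts and uses the VID +50 mV / -30 mV transient envelope from the specification slide; the function name and the sample values are illustrative only, and the sketch does not separate steady-state ripple (±30 mV) from transient overshoot.

```python
def check_vdd_samples(samples_v, vid_v, over_mv=50.0, under_mv=30.0):
    """Check sampled SENSEVDD values against VID +over_mv / -under_mv.

    Returns (ok, worst_overshoot_mv, worst_undershoot_mv), where the
    undershoot is reported as a negative number.
    """
    deviations_mv = [(v - vid_v) * 1e3 for v in samples_v]
    worst_over = max(deviations_mv)
    worst_under = min(deviations_mv)
    ok = worst_over <= over_mv and worst_under >= -under_mv
    return ok, worst_over, worst_under


# Illustrative samples around an assumed VID of 1.000 V.
ok, over, under = check_vdd_samples([1.000, 1.023, 0.982, 1.004], vid_v=1.0)
print(f"compliant={ok}, overshoot={over:.0f} mV, undershoot={under:.0f} mV")
```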
Voltage Observations
Load Board pattern looping - SENSEVDD - avg
dhrystone power pattern from vector 3 to end of pattern – 25C -1800 MHz.
Sync at vector 369.
Average of 16 captures shows:
SENSEVDD AC: +21 mV / -28mV
[Figure: SENSEVDD versus time across the phases system por, plat config & dma, dhrystone, and complete, with current levels of 9A, 11A, and 18A and time markers at 1.15ms, 1.4ms, 2.15ms, and 2.43ms. Annotations: -70 mV undershoot, ~28 mV undershoot, ~28 mV overshoot, <10mV ripple.]
SENSEVDD remains constant despite increased current demand but spikes at steps
Load Board pattern looping - VDD – avg DC
dhrystone power pattern from vector 3 to end of pattern – 25C -1800 MHz.
Sync at vector 369.
Average DC shows:
VDD: 1.023V +36 mV / -29mV
[Figure: VDD versus time with current levels of 9A, 11A, and 18A marked.]
VDD adjusts upward to compensate for increased current demand
ΔV on the T4240RDS w/24 cores running Dhrystone on
Linux
1 Sample, 200MHz filter
This could be caused by the die, the board, the electric lights on the
bench, or the atmosphere. Not sure which. Probably not the power supply.
ΔV on the T4240RDS w/24 cores running Dhrystone on
Linux
1 Sample, 20MHz filter
[Figure: oscilloscope capture with a 5 ms marker. Event occurring every 4 ms.]
T4240RDS w/24 cores running Dhrystone on Linux
1 Sample, 20MHz filter, triggered by “the event”
23 mV overshoot
18 mV undershoot
10 µs occurs every 4 ms
Believe this is caused by a current step on the die. But hard to tell
in Linux so will develop our own controlled test case.
Creating a Current Step
Core + Platform Current from data sheet for e6500 SOCs
e6500 core-based parts | T4240 r2 | T4160 r2 | T2080* | T2081*
Maximum 1867/800/1867/66 @ 105C | 63A | 53A | ~27.3A | ~26.6A
Thermal 1867/800/1867/66 @ 105C | 54A | 46A | ~25.2A | ~24.2A
Typical 1867/800/1867/66 @ 65C | 37A | 31A | ~14.1A | ~13.3A
Maximum 1667/733/1867/66 @ 105C | 61A | 50A | |
Thermal 1667/733/1867/66 @ 105C | 52A | 44A | |
Typical 1667/733/1867/66 @ 65C | 34A | 28A | |
Maximum 1500/667/1600/66 @ 105C | 50A | 40A | ~21.2A | ~20.5A
Thermal 1500/667/1600/66 @ 105C | 42A | 35A | ~19.4A | ~18.7A
Typical 1500/667/1600/66 @ 65C | 30A | 25A | ~12.3A | ~11.6A
Maximum 1200/533/1600/66 @ 65C: 16.7A
Typical power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA
on the platform at 100% activity factor.
Thermal power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA
on the platform at 100% activity factor.
Maximum power assumes Dhrystone running with activity factor at 100% (on all cores) and executing
DMA on the platform at 115% activity factor.
* 1800/700/2133/66; 1533/600/1867/66; 1200/533/1600/66
Core + Platform Current from data sheet for e5500 SOCs
e5500 core-based parts | P5020 | P5010 | P5040* | P5021 | T1040**
Maximum 2000/800/1333/66 @ 105C | 27.3A | 22.7A | 40.0A | 28.2A |
Thermal 2000/800/1333/66 @ 105C | 25.4A | 21.8A | 38.2A | 27.3A |
Typical 2000/800/1333/66 @ 65C | 14.5A | 12.7A | 26.4A | 19.1A |
Maximum 1800/700/1300/66 @ 105C | 25.4A | 20.9A | 38.2A | 27.3A |
Thermal 1800/700/1300/66 @ 105C | 23.6A | 20.0A | 37.3A | 26.4A |
Typical 1800/700/1300/66 @ 65C | 12.7A | 10.9A | 24.6A | 18.2A |
Maximum 1600/600/1200/66 @ 105C | 20.9A | 17.3A | | | ~6.4A
Thermal 1600/600/1200/66 @ 105C | 20.0A | 17.3A | | | ~6.0A
Typical 1600/600/1200/66 @ 65C | 11.8A | 10.9A | | | ~4.2A
Maximum 1200/600/1200/66 @ 65C | 18.0A | 15.0A | | | 5.8A
Typical power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA
on the platform at 100% activity factor.
Thermal power assumes Dhrystone running with activity factor of 60% (on all cores) and executing DMA
on the platform at 100% activity factor.
Maximum power assumes Dhrystone running with activity factor at 100% (on all cores) and executing
DMA on the platform at 115% activity factor.
* 2000/700/1333/66; 1800/600/1200/66
** 1400/600/1600/66; 1200/500/1600/66
ΔI on the T4240 load board at 25C ambient
dhrystone power pattern from vector 3 to end of pattern – 25C – 1800 MHz.
[Figure: current versus time through the phases System POR, Core Boot, Plat config and dma running, Dhrystone, and pattern stopped, with current levels of 9A, 11A, and 18A and time markers at 1.15ms, 1.4ms, and 2.15ms.]
What is the current demand of the die wrt time?
• Static timing requires paths to finish inside 1 cycle. (most paths)
• For e5500 on P5020, the core was timed to 460ps – very small dt!
• More likely current can’t change dramatically in less than 4–6 core
clocks and that would be rare worst case.
[Figure: % of paths still toggling after clock edge at t=0 (blue). Vertical axis 0 to 1; horizontal axis 0 to 500.]
Worst Case AC Current Stimulus Goal
• Programmatically cause the actual die to represent a variable load
at controlled frequencies.
• Observing 23 cores to change from wait to intensive compute within
5 core clocks of one another (3ns at 1.67GHz)
Max frequency = platform clk/16
[Figure: current and voltage versus time. On each decrementer interrupt (decr intrpt) the IPI vector toggles between 1 and 0 and the cores alternate between power intensive instructions (High current) and the minimal power wait instruction (Low current). GPIO4[3] signal for o’scope sync.]
CONFIRMED 23 THREADS IN PH10 DURING MINIMAL POWER (using TWAITSR0)!
Wait for Interrupt Instruction
• wait stops synchronous processor activity…until an asynchronous
interrupt …occurs.
• The processor may use this to reduce power consumption. When an
interrupt occurs while the processor is waiting, its associated save/restore
register 0 will point to the instruction following the wait.
• Core frequency stays constant.
• Power switches from HI to LO and back on decrementer interrupt.
• Hypothesis: current is constant for HI and LO at all decrementer frequencies.
[Figure: current versus core frequency f (0, 1333, 1500, 1600). Labels: Imax, HI (CmaxVf), LO (CwaitVf), Istatic.]
Power Management Fundamentals
• CMOS Energy Consumption
− Dynamic Energy Consumption
− Static Energy Consumption
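The CmaxVf and CwaitVf labels on the wait-instruction slide suggest the usual first-order CMOS model, I = C_eff · V · f + I_static. The sketch below applies that model to the T4240 r2 Dhrystone step reported later in the deck (34 A to 48 A at 1.0 V). The function names are assumptions, the 1.667 GHz core frequency is assumed for illustration, and the resulting capacitance is an effective lumped value rather than a data-sheet parameter.

```python
def supply_current_a(c_eff_f, vdd_v, freq_hz, i_static_a=0.0):
    """First-order CMOS supply current: I = C_eff * V * f + I_static."""
    return c_eff_f * vdd_v * freq_hz + i_static_a


def delta_c_eff_f(delta_i_a, vdd_v, freq_hz):
    """Change in effective switched capacitance implied by a current step."""
    return delta_i_a / (vdd_v * freq_hz)


# Step between the wait (LO) and compute (HI) states at constant V and f.
dc = delta_c_eff_f(48.0 - 34.0, 1.0, 1.667e9)
print(f"delta C_eff ~ {dc * 1e9:.1f} nF")
```

Because voltage and frequency are held constant between the two states, the whole current step in this model comes from the change in switched capacitance, which is consistent with the hypothesis that the HI and LO currents do not depend on the decrementer frequency.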
Fast Current Step for T4240
• Inter-processor interrupt causes all 23 cores to switch from wait to
power intensive within 5 core clocks (3ns).
• On-die current slew ~6000A/us
[Figure: Normal distribution of time per core for change of state from wait to run. Percentage of samples (0% to 35%) versus time in nanoseconds (0 to 200). Std Dev: 0.77 ns; Median: 139.8 ns; Slope: ~6000A/us.]
Slower Current Step for T4240
• Inter-processor interrupt sent sequentially to each of 23 cores with
an intervening delay (3 instructions) caused a switch from wait to
power intensive within ~500 core clocks (300ns).
• On-die current slew ~60A/us
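The two slew figures follow from dividing the current step by the time over which the cores wake. The sketch below assumes an 18 A step, which is the value that reproduces both the ~6000 A/us and ~60 A/us numbers and matches the largest step measured on the T4240; the function names are illustrative.

```python
def clocks_to_seconds(n_clocks, core_hz):
    """Duration of n core clocks at the given core frequency."""
    return n_clocks / core_hz


def slew_a_per_us(step_a, seconds):
    """Average current slew in A/us for a step completed in `seconds`."""
    return step_a / (seconds * 1e6)


STEP_A = 18.0  # assumed step size; reproduces the slew figures on the slides

for label, seconds in (("simultaneous IPI", 3e-9), ("staggered IPI", 300e-9)):
    print(f"{label}: {slew_a_per_us(STEP_A, seconds):.0f} A/us")

print(f"5 clocks at 1.67 GHz = {clocks_to_seconds(5, 1.67e9) * 1e9:.2f} ns")
```

Staggering the wake-up by a factor of 100 in time reduces the on-die slew by the same factor, which is the software lever the slides refer to when they say power state changes can be programmatically controlled.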
[Figure: Normal distribution of time taken per core for change of state from wait to run. Percentage of samples (0 to 35) versus time in nanoseconds (0 to 400). Std Dev: 118.8 ns; Median: 310.9 ns; Slope: ~60A/us.]
What is the Correct Use Case for Current Step?
What use case for current step to max power?
• DHRY: Dhrystone (entirely integer code)
• FXSC6/12/15: Scalar fixed-point radix-two, in-place DFT, 2^n points* (all integer)
• FPSC6/12/15: Scalar floating-point radix-two, in-place DFT, 2^n points (add SPFP)
• FXAV6/12/15: Vector fixed-point radix-two, in-place DFT, 2^n points (SIMD 8 shorts)
• FPAV6/12/15: Vector floating-point radix-two, in-place DFT (SIMD 4 SPFP)
• Core 0 to continuously control and report current from I2C
• Combinations of thread 1 through thread 23 running separate copies (AMP) of
above benchmarks.
− 3 clusters, 12 cores, 23 threads for T4240
− 2 clusters, 8 cores, 15 threads for T4160
− 1 cluster, 4 cores, 7 threads for T2080-like part
• PCL10 cluster power-saving state for inactive clusters.
* Where n = 6/12/15
Performance Metrics for Selection of Use Cases
Benchmark | IPC | CLKs | FP/i% | AV/i% | IL1M/i% | DL1M/i% | L2Hits
DHRY | 0.62 | 1492 | 0 | 0 | 0 | 0.2% | 40
FXSC6 (N=64) | 0.18 | 7.37M | 0 | 0 | 2.0% | 0.0% | 54.2K
FPSC6 (N=64) | 0.18 | 7.66M | 0.1% | 0 | 2.0% | 0.0% | 56.4K
FXAV6 (N=64) | 0.18 | 7.34M | 0.0% | 0.04% | 2.0% | 0.0% | 53.8K
FPAV6 (N=64) | 0.18 | 7.64M | 0 | 0.04% | 2.0% | 0.0% | 56.2K
FXSC12 (N=4K) | 0.53 | 12.74M | 0 | 0 | 0.4% | 0.0% | 169.7K
FPSC12 (N=4K) | 0.38 | 11.55M | 5.3% | 0 | 0.7% | 0.0% | 150.3K
FXAV12 (N=4K) | 0.26 | 8.86M | 0.0% | 3.2% | 1.2% | 0.0% | 97.0K
FPAV12 (N=4K) | 0.30 | 9.76M | 0 | 3.0% | 1.1% | 0.0% | 125.0K
FXSC15 (N=32K) | 0.92 | 61.46M | 0 | 0 | 0.2% | 0.0% | 1301.7K
FPSC15 (N=32K) | 0.67 | 45.93M | 7.7% | 0 | 0.8% | 0.0% | 1096.7K
FXAV15 (N=32K) | 0.52 | 20.74M | 0 | 7.0% | 0.6% | 0.0% | 504.0K
FPAV15 (N=32K) | 0.60 | 27.32M | 0 | 5.5% | 1.2% | 0.0% | 841.4K
T4240 Current Step Observations
T4240 r1 Current Measurement – Dhrystone on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone: 46A to 59A step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 12 cores/24 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
Consult the HW spec for actual max power numbers!
T4240 r2 Current Measurement – Dhrystone on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone: 34A to 48A step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 12 cores/24 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
Consult the HW spec for actual max power numbers!
T4240 r1 Load Step – AltiVec on 12 cores
• T4240RDB with International Rectifier 3565A VR.
• AltiVec FP FFT: 18A max step in ~3ns at 1.0V ~105C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 12 cores/24 threads FFT 4096 pts AltiVec Floating Point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
Changing the HW spec from 30A step to 20A max!
T4240 r1 Current Measurement – Dhrystone on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 45.5A to 53.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 8 cores/16 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
T4240 r2 Current Measurement – Dhrystone on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 34.5A to 43.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 8 cores/16 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
T4240 r1 Load Step – AltiVec on 8 cores
• T4240RDB with International Rectifier 3565A VR.
• With AltiVec: 11A max step in ~3ns at 1.05V ~105C.
• Max undershoot and overshoot <15mV
[Figure: Current change from wait to full power on IPI interrupt, 8 cores/16 threads FFT 4096 pts AltiVec Floating Point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
T4240 r1 Current Measurement – Dhrystone on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 45.5A to 49.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 4 cores/8 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
T4240 r2 Current Measurement – Dhrystone on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• Dhrystone (integer) 35A to 39.5A step in ~3ns at 1.05V ~105C.
[Figure: Current change in T4240 rev. 2 from wait to full power on IPI interrupt, 4 cores/8 threads Dhrystone. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
T4240 r1 Load Step – AltiVec on 4 cores
• T4240RDB with International Rectifier 3565A VR.
• With AltiVec: 5A max step in ~3ns at 1.05V ~105C.
• Max undershoot and overshoot <15mV
[Figure: Current change in T4240 rev. 1 from wait to full power on IPI interrupt, 4 cores/8 threads FFT 4096 pts AltiVec Floating Point. Current (amperes) and temperature (C; Diode1, Diode2, Temp Controller) versus time (seconds), 0 to 700. Frequency: 0.1Hz.]
Measured Step on T4240 RDB
Observed current step for combined cores and platform at ~100C, 1.66GHz, 1.05V | T4240 (24 cores) | Estimate T4160 (16 cores) | Estimate T2080 (8 cores)
Dhrystone | 14.5 A | 9.0 A | 4.0 A
Fixed-point DFT | 18.0 A | 11.0 A | 5.5 A
Floating-point DFT | 18.0 A | 12.0 A | 5.5 A
Vector Fixed-point DFT | 18.0 A | 12.0 A | 5.5 A
Vector Floating-point DFT | 18.0 A | 11.5 A | 5.5 A
Dynamic current step is nearly constant over temperature and core
frequency.
T1040 Current Step Observations
T1040 Current Measurement – Dhrystone on 4 cores
• T1040 with International Rectifier 3565A VR.
• Dhrystone: 3.4A to 4.45A step in ~3ns at 1.0V ~Room temp.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T1040 from wait to full power on IPI interrupt, 4 cores Dhrystone. Current (amperes, 3 to 5) and temperature (C; Diode1, 25 to 45) versus time (seconds), 0 to 450. Frequency: 0.1Hz.]
T1040 Current Measurement – Dhrystone on 4 cores
• T1040 with International Rectifier IR36021 and IR3550.
• Dhrystone: 3.75A to 4.85A step in ~3ns at 1.0V ~85C.
• Max undershoot and overshoot <30mV @ 1.05V
[Figure: Current change in T1040 from wait to full power on IPI interrupt, 4 cores Dhrystone. Current (amperes, 3.2 to 5.2) and temperature (C; Diode1, 20 to 100) versus time (seconds), 0 to 100.]
Temperature control via heat gun!
Discussion of current slew rate
What does the on-die current step say about di/dt externally?
• On-die capacitance and package inductance reduce di/dt at VDD pins.
• Recommended decoupling caps (0.1uF) on every power pin further
reduce it to what the bulk decoupling capacitors have to deal with (spec’d
12A/us).
• From AN2747:
di/dt is a parameter of the silicon die that is essentially hidden by the
capacitive and inductive components of the die substrate, the die-local
bypass capacitors, the socket (if any) and other parasitics. Consequently,
the di/dt parameter used to design the power system is not the di/dt of the
processor die … but the filtered di/dt of the combined processor,
substrate-resident capacitors and the substrate itself. This di/dt is much
slower, as the current demands are initially supplied by the adjacent
transistors, die power traces, die substrate and local capacitors.
Explaining the reduction of di/dt vs decoupling caps
(hypothetical example)
[Figure: hypothetical example with di/dt labels of 15 A/us, 1350 A/us, and 3500 A/us.]
di/dt from the tester 110C
dhrystone power pattern from vector 369 (system ready) to vector 6000 (platform
configured and dma running) - biggest current bump 110C 1800 MHz
[Figure: tester current trace; the biggest current bump is 18 A at 1.4 A/μs.]
Is delta Voltage within spec?
Transient Undershoot and Overshoot on T4240RDS with
18A load step (shown relative to earlier slide)
[Figure: oscilloscope capture of vDD and IOUT (20A/div) versus time, overlaid with the spec terms from the earlier slide. Labels: VID or DC set point, tolerance VID +50mV / -30mV, overshoot, undershoot (principal silicon concern), switching ripple, IDD step-up and step-down, load-step.]
Load Step with 12 cores for IR3565 on T4240RDB
W/AltiVec – 20A Step – ~100C – 1.05V
<10mV ripple
18mV undershoot
Load Release with 12 cores for IR3565 on T4240RDB
W/AltiVec – 20A Step – ~100C (TBC) – 1.05V
<10mV ripple
23mV overshoot
Load Step with 4 cores for IR3565 on T4240RDB
W/AltiVec – 6A Step – ~100C (temp to be confirmed)
<10mV ripple
12mV undershoot
Load Release with 4 cores for IR3565 on T4240RDB
W/AltiVec – 6A Step – ~100C (temp to be confirmed)
<10mV ripple
10mV overshoot
Conclusion
• We have load step current change data for 12 cores, 8 cores, and 4 cores
for what we think is a worst case use case with and without AltiVec.
• We have di/dt measurements but they are taken with our decoupling caps
included. As a result they are significantly lower than the value obtained
from the current step changing in the measured time on die. In other
words di/dt is reduced by on-die capacitance, package parasitics, and
on-board decoupling.
• We recommend designing to our spec, i.e.
− …place at least one decoupling capacitor at each VDD, OVDD, DVDD,
GnVDD, and LVDD pin of the device. These capacitors should have a value
of 0.1 μF. Only ceramic SMT (surface mount technology) capacitors should
be used to minimize lead inductance, preferably 0402 or 0603 sizes.
− As a guideline for customers and their power regulator vendors, Freescale
recommends that these bulk capacitors be chosen to maintain the positive
transient power surges to less than VID+50 mV (negative transient
undershoot should comply with specification of VID-30mV) for current steps
of up to 20A for 12 cores, 15A for 8 cores and 10A for 4 cores with a slew
rate of 12 A/us.
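As a quick cross-check of the guideline against the measurements in this deck, the sketch below compares the specified current step for each core count with the largest AltiVec step reported on the T4240RDB slides (18 A, 11 A, and 5 A). The dictionary and function names are illustrative, and the comparison says nothing about voltage margin, only about how much larger the specified step is than the observed one.

```python
# Specified worst-case current steps (A) by number of cores, from the guideline.
SPEC_STEP_A = {12: 20.0, 8: 15.0, 4: 10.0}

# Largest AltiVec load steps (A) reported on the T4240RDB measurement slides.
MEASURED_ALTIVEC_STEP_A = {12: 18.0, 8: 11.0, 4: 5.0}


def step_margin_a(cores):
    """Specified step minus measured step for the given core count."""
    return SPEC_STEP_A[cores] - MEASURED_ALTIVEC_STEP_A[cores]


for cores in sorted(SPEC_STEP_A, reverse=True):
    print(f"{cores:2d} cores: spec {SPEC_STEP_A[cores]:.0f} A, "
          f"measured {MEASURED_ALTIVEC_STEP_A[cores]:.0f} A, "
          f"margin {step_margin_a(cores):.0f} A")
```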
Conclusion
• DC Voltage Specification communicates how VRM must respond to
changes in load current demand. High-end VRMs can easily meet
±1% up to ~100 kHz.
• AC Voltage Specification communicates how PDS must damp
higher frequency (100 kHz to 100 MHz?) dv/dt events caused by
di/dt through inductive parasitics.
• dv/dt on a customer’s system is a function of Z and di/dt from
T4240 and other sources.
• We are measuring ΔI on real silicon for several different use cases.
• It is practical to achieve ΔV < 30mV.
References
1. “Extended Adaptive Voltage Positioning (EAVP)”, Alex Waizman and
Chee-Yee Chung, pp 65-68, 2000
2. “CPU Power Supply Impedance Profile Measurement Using FFT and
Clock Gating”, Alex Waizman, pp 29-32, 2003
3. “Resonant Free Power Network Design Using Extended Adaptive
Voltage Positioning (EAVP) Methodology”, Alex Waizman and Chee-Yee
Chung, IEEE Transactions on Advanced Packaging, Vol. 24, No. 3,
August 2001
4. “A Resonance-Free Power Delivery System Design Methodology
Applying 3D Optimized Extended Adaptive Voltage Positioning”, Tao Xu
and Brad Brim, pp 107-110, 2008
5. “Integrated Power Supply Frequency Domain Impedance Meter
(IFDIM)”, Alex Waizman, pp 217-220, 2004
6. “Power Delivery Network (PDN) Tool User Guide”, Altera, March 2009
7. High-Speed Digital Design: A Handbook of Black Magic, Howard
Johnson and Martin Graham, Prentice-Hall, 1993
8. Frequency-Domain Characterization of Power Distribution
Networks, Istvan Novak and Jason R. Miller, Artech House, 2007
9. “Power Supply Design for PowerPC™ Processors”, Gary Milliorn,
Freescale AN2747, Rev. 1.1, 09/2004
10. “Power Supply Network Design for 3% Voltage Margin”,
FTF-ENT-F0038, June 2012
Introducing The
QorIQ LS2 Family
Breakthrough,
software-defined
approach to advance
the world’s new
virtualized networks
New, high-performance architecture built with ease-of-use in mind
Groundbreaking, flexible architecture that abstracts hardware complexity and
enables customers to focus their resources on innovation at the application level
Optimized for software-defined networking applications
Balanced integration of CPU performance with network I/O and C-programmable
datapath acceleration that is right-sized (power/performance/cost) to deliver
advanced SoC technology for the SDN era
Extending the industry’s broadest portfolio of 64-bit multicore SoCs
Built on the ARM® Cortex®-A57 architecture with integrated L2 switch enabling
interconnect and peripherals to provide a complete system-on-chip solution
QorIQ LS2 Family
Key Features
Unprecedented performance and ease of use for smarter, more
capable networks (SDN/NFV, Switching, Data Center, Wireless Access)
High performance cores with leading interconnect and memory bandwidth
• 8x ARM Cortex-A57 cores, 2.0GHz, 4MB L2 cache, w Neon SIMD
• 1MB L3 platform cache w/ECC
• 2x 64b DDR4 up to 2.4GT/s
A high performance datapath designed with software developers in mind
• New datapath hardware and abstracted acceleration that is called via
standard Linux objects
• 40 Gbps Packet processing performance with 20Gbps acceleration
(crypto, Pattern Match/RegEx, Data Compression)
• Management complex provides all init/setup/teardown tasks
Leading network I/O integration
• 8x1/10GbE + 8x1G, MACSec on up to 4x 1/10GbE
• Integrated L2 switching capability for cost savings
• 4 PCIe Gen3 controllers, 1 with SR-IOV support
• 2 x SATA 3.0, 2 x USB 3.0 with PHY
See the LS2 Family First in the Tech Lab!
4 new demos built on QorIQ LS2 processors:
Performance Analysis Made Easy
Leave the Packet Processing To Us
Combining Ease of Use with Performance
Tools for Every Step of Your Design
www.Freescale.com
© 2014 Freescale Semiconductor, Inc. | External Use