A Fast-Locked All-Digital Phase-Locked Loop for Dynamic Frequency Scaling
Dian Huang, Ying Qiao

Motivation
- CMOS IC technology keeps scaling further
- SoCs benefit from all-digital PLL (ADPLL) designs
- Dynamic frequency scaling in CPUs
- Fast-locked phase-locked loop (PLL) for clock generation
- Tradeoff between locking time and clock jitter
- We focus on ADPLL design with a bang-bang phase detector (BBPHD)
- Digitally controlled oscillator (DCO) frequency search using algorithms with successive-approximation registers (SAR)

UC Berkeley

ADPLL Architecture
Conventional vs. Proposed ADPLL Architecture
[Figure: block diagrams of the conventional BBPHD ADPLL and the proposed BBPHD ADPLL with SAR. Labeled blocks and signals: Ref_clk, Ref_clkd, DCDL, MUX1, MUX2, MUX3, BBPHD, PI controller (β, α), DCO, divide-by-16, SAR delay search, SAR frequency search, frequency-search mode, reset, activate, user-defined select, Clock_out.]

Design Considerations
A tradeoff exists between frequency/phase locking time and output clock jitter performance.

\[
t_{lock} = \frac{\pi}{2\pi \times \dfrac{\beta k_{vco} - f_{off}}{f_{ref}}} \times \frac{1}{f_{ref}}
\]

- f_ref: reference clock frequency
- f_off: initial frequency error
- βk_vco: system loop gain

\[
q = \beta - \frac{\alpha (1 + 2D)}{2}
\]

- β: proportional path gain
- α: integral path gain

\[
\Delta t_{pp} = N k_{vco} \, \frac{(1+D)^4 \alpha^3 + 4 (1+D)^3 \alpha^2 q + 8 (1+D)^2 \alpha q^2 + 8 (1+D) q^3}{4 q^2}
\]

Fast-locking Techniques
Simultaneous frequency and phase locking
- Yang, JSSC '10: adaptive loop gain
- Hung, Trans. Circuits & Syst. '11: modified bang-bang algorithm
Detangled frequency and phase locking
- Chung, JSSC '11: BSA frequency search + TDC phase locking

Proposed ADPLL Architecture
[Figure: block diagram of the proposed BBPHD ADPLL with SAR delay search and SAR frequency search, as on the architecture comparison slide.]

SAR-based Frequency Search
1. Set DCO[MSB] = 1.
2. ref_clk = 0: activate the DCO and divider.
3. Test T_ref_clk > T_divider and write 1 or 0 to the MSB accordingly.
4. ref_clk = 1: deactivate the DCO and divider.
5. Set DCO[MSB-1] = 1 and repeat the activate, compare, and deactivate sequence, writing 1 or 0 to [MSB-1].
6. Continue bit by bit down to the LSB, writing 1 or 0 to [LSB].
7. Frequency search done.
[Figure: timing waveforms of the reference clock, divider output, oscillator output, and BBPHD UP signal.]

SAR-based Delay Search
- The falling edge of the divider output does not align with that of the reference clock because of delay.
- Extra delay is added to the reference clock to compensate.
- Once the frequency search is done, the CPU designer can choose whether the PLL input clock is the reference clock or its delayed version, based on jitter and locking requirements.
[Figure: delay-line schematic with signals REF_CLK, REF_CLKD, DELAY[0] to DELAY[3], DELAYN[0] to DELAYN[2], and Vdd.]

Locking Procedure
- 2 cycles of delay search and 10 cycles of frequency search for a 10-bit DCO.
- The remaining frequency error and phase error are tiny.
- Locks at 790 ns.

Five-Stage DCO
- The DCO consists of 960 tri-state buffers: 64 rows of 15 buffers each.
- Five extra tri-state buffers drive each node to either Vdd or ground during reset for fast start-up.
- DCO frequency range: 0.42 GHz to 12 GHz

PI Controller
- With the proposed frequency-search algorithm, small β and α can be chosen.
- β needs to be several times larger than α for stability, but β should be 1 or 2 to minimize quantization noise.
- The integral-path code increments by 1 only when it could otherwise increment by 4.

Performance

| Key Parameter       | Value            |
|---------------------|------------------|
| Technology          | 45 nm            |
| Locking Time        | 790 ns           |
| Jitter RMS          | 1.32 ps          |
| Jitter peak-to-peak | 4.56 ps          |
| Power               | 16 mW @ 4.5 GHz  |

- Achieves 790 ns locking time while maintaining 1.32 ps RMS jitter.
- The peak-to-peak jitter figure is too optimistic.
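The SAR-based frequency search described on the slides can be illustrated with a short behavioral model. This is a minimal sketch, not the authors' implementation: the linear code-to-frequency DCO model, the function names, and the choice to clear a bit when T_ref_clk > T_divider (that is, when the divided DCO clock is faster than the reference) are all assumptions made for illustration. Only the 10-bit width, the divide-by-16 ratio, and the 0.42 GHz to 12 GHz range come from the slides.

```python
def dco_frequency_hz(code, f_min=0.42e9, f_max=12e9, bits=10):
    """Hypothetical DCO model: frequency rises linearly with the control code."""
    return f_min + (f_max - f_min) * code / (2**bits - 1)


def sar_frequency_search(f_ref_hz, bits=10, divide_by=16, dco=dco_frequency_hz):
    """Successive-approximation search for the DCO control code.

    One bit is resolved per reference cycle, MSB first, so a 10-bit DCO
    needs 10 trials.
    """
    t_ref = 1.0 / f_ref_hz
    code = 0
    for bit in range(bits - 1, -1, -1):
        trial = code | (1 << bit)               # tentatively set this bit to 1
        t_div = divide_by / dco(trial, bits=bits)  # period of the divided DCO clock
        # Assumed polarity: if T_ref_clk > T_divider the divided clock is too
        # fast, so the bit goes back to 0; otherwise the 1 is kept.
        if not (t_ref > t_div):
            code = trial
    return code


if __name__ == "__main__":
    # Target 4.5 GHz at the DCO output, i.e. a 281.25 MHz reference.
    code = sar_frequency_search(4.5e9 / 16)
    print(code, dco_frequency_hz(code) / 1e9)
```

With this assumed DCO model the search settles on code 360, the largest code whose frequency does not exceed the 4.5 GHz target. The residual frequency error is then less than one DCO step, which is what lets the bang-bang loop take over with small gains.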
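The lock-time expression on the Design Considerations slide can be simplified by hand. The lines below are a sketch that assumes the expression reads as π divided by the net phase correction per reference cycle, multiplied by the reference period; that reading is an interpretation of the slide rather than something it states explicitly.

```latex
\begin{align*}
t_{lock} &= \frac{\pi}{2\pi \times \dfrac{\beta k_{vco} - f_{off}}{f_{ref}}} \times \frac{1}{f_{ref}} \\
         &= \frac{1}{2\,(\beta k_{vco} - f_{off})}, \qquad \beta k_{vco} > f_{off}
\end{align*}
```

Under this reading f_ref cancels, and the lock time is set by the margin between the loop gain βk_vco and the initial frequency error f_off. Shrinking f_off through a frequency search therefore permits a smaller β for the same lock time, which is consistent with the PI Controller slide's remark that small β and α can be chosen.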
Comparison

|                     | [10] Hsu           | [8] Kim             | [9] Chung          | [2] Tierno    | This Work         |
|---------------------|--------------------|---------------------|--------------------|---------------|-------------------|
| CMOS Process        | 0.18 µm            | 0.13 µm             | 65 nm              | 45 nm         | 45 nm             |
| Core Area           | 0.14 mm²           | 0.2 mm²             | 0.07 mm²           | 0.07 mm²      | N/A               |
| Power               | 26.7 mW @ 600 MHz  | 16.5 mW @ 1.35 GHz  | 1.81 mW @ 520 MHz  | N/A           | 16 mW @ 4.5 GHz   |
| Output Range        | 62~616 MHz         | 0.3~1.4 GHz         | 90~527 MHz         | 0.8~12 GHz    | 0.42~12 GHz       |
| Locking Time        | N/A                | 3.5 µs              | N/A                | *46 µs        | 790 ns            |
| Jitter RMS          | 7.28 ps @ 600 MHz  | 3.7 ps @ 1.35 GHz   | 8.64 ps @ 527 MHz  | 1 ps @ 5 GHz  | 1.32 ps @ 4.5 GHz |
| Jitter peak-to-peak | 56 ps @ 600 MHz    | 32 ps @ 1.35 GHz    | N/A                | N/A           | 4.56 ps @ 4.5 GHz |

Conclusion
- The proposed ADPLL achieves fast locking without sacrificing jitter performance.
- The 790 ns locking time demonstrates that it is suitable for dynamic frequency scaling.
- Future work includes an ADPLL with smooth frequency changes so that the CPU does not need to stall its instructions.