Vol. 32, No. 4    Journal of Semiconductors    April 2011
A 270-MHz to 1.5-GHz CMOS PLL clock generator with reconfigurable
multi-functions for FPGA*
Zhang Hui(张辉)1,2, Yang Haigang(杨海钢)1,†, Wang Yu(王瑜)1,2, Liu Fei(刘飞)1,
and Gao Tongqiang(高同强)1
1Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
2Graduate University of the Chinese Academy of Sciences, Beijing 100049, China
Abstract: A PLL clock generator with reconfigurable multi-functions for FPGA design applications is presented.
This clock generator has two configurable operation modes to achieve clock multiplication and phase alignment
functions, respectively. The output clock signal has advanced clock shift ability such that the phase shift and duty
based on a novel quick start-up technique is proposed. A new delay partition method is also adopted to improve the
speed of the post-scale counter, which is used to realize the programmable phase shift and duty cycle. A prototype
consumption and the measured RMS jitter at 1 GHz are less than 18 mW and 9 ps, respectively. The settling time
is approximately 2 μs.
Key words: PLL; clock generator; reconfigurable; VCO
DOI: 10.1088/1674-4926/32/4/045010    EEACC: 2570
1. Introduction
Phase-locked loops (PLLs) for clock multiplication and phase
alignment are widely used in high-speed, large-scale field
programmable gate arrays (FPGAs) as a key component of the global
clock distribution networks across the chip. In addition to common
needs such as high-frequency output, a wide frequency tuning range
and low jitter, this type of application usually demands flexible
configuration of the output clock frequency, the duty cycle and the
phase relationship between the input and output clock signals, which
means that nearly all of the functions of the PLL should be programmable.
A majority of the reported PLL circuits have been focused
ficient consideration for multi-functional and reconfigurable
hand, the PLL clock generators used in commercial FPGA chips
provide multiple functions to support a diversity of applications;
however, they usually cannot match the performance of the reported
standalone PLL chips in frequency range, jitter, lock time,
etc. To maximize both sides of the requirements, this paper
tions, yet sustaining the satisfying performances in compari-
the clock generation mode and the skew compensation mode
to achieve clock multiplication and phase alignment functions,
respectively. Phase alignment can be realized by feeding back
the external output clock or a middle point clock off the clock
tree to align them with the input clock[3]. The output clock signal
has advanced clock shift ability such that the phase shift
and duty cycle are programmable by an ingenious post-scale
counter design. All of these functions can be reconfigured by a
tions of phase shift and phase alignment accurately, a quick
start-up technique is used in the VCO to attain an accurate
phase relationship of the multi oscillating waveforms. A novel
delay partition method is also proposed to improve the speed
of the post-scale counter. Another reconfigurable feature is the
implementation of a variable current charge pump that is used to
achieve an approximately constant loop bandwidth over wide
loop configuration settings, since the variable loop bandwidth
2. System design consideration
The block diagram of the proposed clock generator is
shown in Fig. 1. The block N is the pre-divider and M is the
feedback divider. The post-scale counter can divide down the
high-frequency output of the VCO. So the output frequency of the
whole system is
$$f_{OUT} = \frac{M}{N \times \mathrm{postscale}} \, f_{IN}. \qquad (1)$$
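As a quick arithmetic check of the relation above, the following minimal Python sketch computes the output frequency from the divider settings. The function name and the example values of M, N and f_IN are illustrative assumptions; only the combination of a 1 GHz VCO frequency and a post-scale ratio of 20 is taken from the measurement section.

```python
from fractions import Fraction

def output_frequency(f_in_hz, m, n, postscale):
    """Return f_OUT = M / (N * postscale) * f_IN using exact rational arithmetic."""
    if m < 1 or n < 1 or postscale < 1:
        raise ValueError("division ratios must be positive integers")
    return Fraction(m, n * postscale) * f_in_hz

# Hypothetical setting: 50 MHz reference, N = 1, M = 20 gives a 1 GHz VCO clock;
# a post-scale ratio of 20 then divides it back down to 50 MHz.
f_vco = output_frequency(50_000_000, m=20, n=1, postscale=1)
f_out = output_frequency(50_000_000, m=20, n=1, postscale=20)
```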
The divider N also eases the limit of the input frequency
The clock generator in Fig. 1 has three feedback loops
which can be switched over to each other by configuration to
implement different functions. The first one is the general PLL
loop to achieve clock multiplication. The second includes a
post-scale counter and a clock tree. A point in the middle of
* Project supported by the National High Technology Research and Development Program of China (No. 2008AA010701).
† Corresponding author. Email: email@example.com
Received 25 September 2010, revised manuscript received 10 November 2010
© 2011 Chinese Institute of Electronics
J. Semicond. 2011, 32(4)    Zhang Hui et al.
Fig. 1. Block diagram of the proposed clock generator.
the clock tree is chosen and used as a feedback signal to the
PLL to reduce the skew of the clock tree for FPGA applications.
The third includes a post-scale counter and a bidirectional I/O port,
which adds the matched I/O delay in the feedback path so as to align
the external output clock fOUT with the external input clock
accurately. To obtain accurate phase alignment in the last two
feedback loops, the dividers M and N must be matched, and a dummy
multiplexer (Mux) is placed before the divider N to make sure
the input signal and the feedback signal have matched paths
into the phase/frequency detector (PFD).
The VCO in Fig. 1 is a 4-stage differential ring oscillator
which provides four phases and their complements with a
45° phase shift per stage. By choosing a different phase from
the eight signals, the absolute phase resolution equals TVCO/8,
where TVCO is the period of the VCO oscillating signal. This is
the fine tuning of the phase. The coarse tuning, with an
absolute phase step of TVCO, is realized through the post-scale counter.
An additional lock detect block indicates whether the PLL is
locked or unlocked. Besides reporting the state of the PLL itself,
the block output also serves as a flag or enable signal in FPGA
applications, so it is indispensable in this kind of PLL clock
generator design.
The division ratios of the dividers M and N and of the post-scale
counter are input through the serial interface and stored in the
internal registers. All other configurations are also operated
through the serial data interface and registers to control the
phase shift and duty cycle of the output signal, the function
modes, and the current of the charge pump according to the
division ratio of M.
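The fine and coarse phase tuning described in this section can be illustrated with a short Python sketch. It assumes that the total shift is simply the sum of a whole number of VCO periods (coarse, from the post-scale counter) and a tap offset in steps of TVCO/8 (fine, from the eight VCO phases); the function names and the picosecond time unit are hypothetical.

```python
from fractions import Fraction

FINE_TAPS = 8  # four differential stages give eight VCO phases

def phase_shift(t_vco, coarse_steps, fine_tap):
    """Total delay: whole VCO periods plus a tap offset of fine_tap * T_VCO / 8."""
    if not 0 <= fine_tap < FINE_TAPS:
        raise ValueError("fine_tap must be in 0..7")
    if coarse_steps < 0:
        raise ValueError("coarse_steps must be non-negative")
    return coarse_steps * t_vco + Fraction(fine_tap, FINE_TAPS) * t_vco

def phase_degrees(t_vco, postscale, coarse_steps, fine_tap):
    """Express the same delay as a phase angle of the divided output clock."""
    t_out = postscale * t_vco
    return phase_shift(t_vco, coarse_steps, fine_tap) / t_out * 360

# 1 GHz VCO (T_VCO = 1000 ps) and a post-scale ratio of 20:
fine_step_ps = phase_shift(1000, 0, 1)      # smallest step, T_VCO / 8
quarter = phase_degrees(1000, 20, 5, 0)     # five coarse steps of a 20-cycle period
```

With these values the smallest step is 125 ps, and five coarse steps correspond to a quarter of the 50 MHz output period.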
3. Circuit implementation
3.1. VCO
Because of the requirement of phase shift and phase align-
ment, the oscillating waves must keep an accurate phase re-
lationship soon after the oscillation starts up; otherwise, the
configured phase information will be interrupted by the small
pulses during start-up of the VCO. So a quick start-up tech-
nique is proposed in the VCO design, one of the key circuits in
the proposed clock generator.
Figure 2 shows the presented VCO, which comprises a supply
regulator and a 4-stage differential ring oscillator. The
inverting stage of the oscillator can achieve a full-swing
oscillation. The regulator converts the control voltage Vc into
the current Ic and improves the power supply rejection ratio
(PSRR) effectively[5]. This reduces the shift of the center
frequency due to supply voltage variation. In addition, it
reduces the effect of power supply bounce on the jitter performance
below the loop bandwidth of the regulator[6].
Fig. 2. Circuit diagram of the VCO.
Fig. 3. 4-stage oscillator with pull up/down.
The principle of the proposed quick start-up technique is
pull-up/down circuit is placed at each node of the oscillator.
Two pull-up/down circuits at X1/Y1 are functional, while the
other six are dummies that keep all stages identical.
The pull-up/down enable signal is the complement of the VCO
enable signal. When the VCO is disabled, node X1 is held high
by the weak pull-up and node Y1 is held low by the weak
pull-down. Once the VCO is enabled, the pull-up/down is
disabled and the circuit oscillates immediately, as shown in
Fig. 4. Comparison of VCO start-up wave (a) without pull-up/down and (b) with pull-up/down.
Fig. 5. Block diagram of post-scale counter.
Fig. 4(b). This is because the pull-up/down circuits effectively
apply an almost full-swing differential signal to the oscillator, so
the process of amplifying a small differential signal stage by stage
is no longer necessary. Such an oscillating wave does not include
any small rising edge compared with the start-up wave of
in Figs. 4(a) and 4(b) are controlled by the same enable signal.
figuration in the post-scale counter following the VCO. So the
proposed technique avoids the failure of phase configuration.
The pull-up/down circuits consume little power, since the
pull-up/down is weak and the current Ic is shut off when the
pull-up/down circuit is enabled.
3.2. Post-scale counter
Figure 5 shows the block diagram of the post-scale
counter, which implements a continuous division ratio of
1 to 32 and a programmable phase shift and duty cycle. The operating
speed is limited by the functional diversity of the counter.
Therefore, a novel delay partition method is proposed to improve the
speed of the counter.
The input PI sets different initial values of the 5-bit
synchronous counter to implement different phase shifts[8]. The
counter value is compared with the configuration codes of the
division ratio (DivR) and duty cycle (DuC) in the two logic
comparator blocks, respectively. The comparison results are
sent to control an RS flip-flop, and thus the programmable
division and duty cycle are obtained. The former comparison result
is also fed back to reset the 5-bit synchronous counter to
guarantee the exact value of the division ratio.
Fig. 6. A novel method to improve the speed of the post-scale counter.
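The counter behaviour described above can be approximated with a small behavioural model in Python. This is a sketch under stated assumptions, not the circuit itself: the exact comparator encoding is not given in the text, so the model assumes the counter wraps after DivR input cycles and that the RS flip-flop output is high while the count is below a duty threshold. The names `div_ratio`, `duty_count` and `phase_init` are hypothetical stand-ins for DivR, DuC and PI.

```python
def post_scale_counter(div_ratio, duty_count, phase_init, n_cycles):
    """Return the output level (0 or 1) after each of n_cycles input clock edges."""
    if not 1 <= div_ratio <= 32:
        raise ValueError("div_ratio must be in 1..32 (5-bit counter)")
    if not 0 < duty_count <= div_ratio:
        raise ValueError("duty_count must be in 1..div_ratio")
    count = phase_init % div_ratio       # PI presets the counter (phase shift)
    levels = []
    for _ in range(n_cycles):
        # RS flip-flop view: set when the counter wraps, reset at the duty threshold
        levels.append(1 if count < duty_count else 0)
        # DivR comparator: reset the counter once the programmed ratio is reached
        count = 0 if count == div_ratio - 1 else count + 1
    return levels

# Settings mirroring the measured waveforms: division by 20,
# 30% and 50% duty cycle, and a quarter-period phase lead.
w1 = post_scale_counter(20, 6, 0, 40)
w2 = post_scale_counter(20, 10, 0, 20)
w3 = post_scale_counter(20, 10, 5, 20)
```

Presetting the counter to 5 out of 20 advances the waveform by a quarter of the output period, which is how an initial value translates into a phase shift in this model.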
The feedback path delay is a bottleneck that limits the
speed of the post-scale counter. This paper adopts a novel delay
partition method to solve the problem. Without a D flip-flop
in the feedback loop, as in a conventional counter, the period
of the input clock Tin cannot be smaller than the delay
of the feedback path (from the input end to the reset end of
the 5-bit synchronous counter, including the setup time). The
design in Fig. 5 relaxes this limit by adding a D flip-flop
in the feedback loop and triggering it with a clock signal
generated by delaying the input clock. The reset signal can then
arrive at the counter during the next period of the input clock. The
total feedback delay is partitioned into two parts, delay1 and
delay2, as shown in Fig. 6, where Δ1 and Δ2 are small delays
generated by the D flip-flop. The limits of Tin become[9]
$$\mathrm{delay1} + \Delta_1 \le T_{in} + \delta,$$
$$\mathrm{delay2} + \Delta_2 \le T_{in} - \delta. \qquad (2)$$
The delay δ tunes the two time windows to make Tin as
small as possible. Such a method improves the speed of the
post-scale counter effectively.
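To see how the tuning delay trades the two timing windows against each other, the two limits on Tin can be solved for the smallest admissible period. The Python sketch below assumes the tuning delay can be set freely to any non-negative value; the function name and all numbers are made up for illustration and are not measured data.

```python
def min_input_period(delay1, delay2, d1, d2):
    """Smallest Tin satisfying both partitioned-delay limits, and the tuning delay used.

    The limits are Tin >= a - delta and Tin >= b + delta, with
    a = delay1 + d1 and b = delay2 + d2. The larger of the two bounds is
    minimised when they are equal, i.e. delta = (a - b) / 2 (clamped at 0).
    """
    a = delay1 + d1
    b = delay2 + d2
    delta = max(0.0, (a - b) / 2)
    t_min = max(a - delta, b + delta)
    return t_min, delta

# Hypothetical delays in picoseconds.
t_min, delta = min_input_period(delay1=600, delay2=300, d1=50, d2=50)
```

With these invented numbers the partitioned loop tolerates an input period of 500 ps at a tuning delay of 150 ps, whereas a single unpartitioned path of roughly delay1 + delay2 = 900 ps would need a period of at least that length.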
3.3. Charge pump
The open-loop transfer function of the PLL system in
Fig. 1 is given by
Fig. 7. A variable current charge pump.
Fig. 8. Improvement of the loop bandwidth and phase margin by variable
current charge pump.
where KVCO is the gain of the VCO, F(s) is the transfer function
of the low-pass filter (LPF), and IP is the charge and discharge
current of the charge pump. To get the wide frequency
tuning range, the loop division ratio M varies from 2 to 32,
which causes a large fluctuation in the loop bandwidth according to
Eq. (3). The variation of the loop bandwidth affects the phase
noise performance and loop stability. So, as shown in Fig. 7,
a variable current charge pump is designed to reduce this
effect[10]. The current of the charge pump varies in proportion
to the value of M through a 3-bit control signal, so the
transfer function Hopen-loop(s) is almost invariant.
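For reference, a form of the open-loop transfer function that is consistent with the symbols defined here can be written under the standard linear model of a charge-pump PLL. This is an assumption rather than a quotation: the factor of 2π presumes that KVCO is expressed in rad/(s·V) and that the PFD and charge pump together contribute a gain of IP/2π.

```latex
H_{\text{open-loop}}(s) = \frac{I_P}{2\pi} \cdot F(s) \cdot \frac{K_{VCO}}{s} \cdot \frac{1}{M}
```

In this form the loop gain depends on IP and M only through the ratio IP/M, so scaling IP in proportion to M leaves the transfer function unchanged.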
Figure 8 shows the simulated system loop bandwidths and
phase margins with the variable current charge pump and a
constant current charge pump as a comparison. Assuming that
the VCO gain is constant, the current of the charge pump
and the division ratio M are main factors of the loop band-
width and phase margin. As Figure 8 shows, the loop band-
width of the system with constant charge and discharge current
varies widely against the division ratio M (the CC-bandwidth
curve). Correspondingly, the loop bandwidth of the system
1.2 MHz (the VC-bandwidth curve). Division ratios of 2–4 are
rarely used, given the measured tuning range of 270 MHz to
1.5 GHz. So the VC-bandwidth curve is acceptable although
it deteriorates in the division range 2–4. The phase
margin is insensitive to M due to the second-order low-pass
filter, but the variable current charge pump still improves the
stability of the system, as a comparison of the two curves
CC-phasemargin and VC-phasemargin shows.
Fig. 9. Block diagram of the lock detect circuit.
Fig. 10. Chip microphotograph of the proposed clock generator and
layout of the core circuits.
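The benefit of stepping the charge pump current with M can be estimated with a few lines of Python. The sketch assumes that the loop gain scales with IP/M, as the text implies, and it invents a mapping from M to the 3-bit current code (one unit of current per four steps of M), since the actual mapping is not stated. The function names are hypothetical.

```python
from fractions import Fraction

CODES = 8  # a 3-bit control word selects one of eight current levels

def cp_code(m, step=4):
    """Hypothetical mapping: one unit of current per `step` of M, capped at 8 units."""
    return min(CODES, -(-m // step))  # integer ceiling of m / step

def loop_gain_factors(m_values, variable):
    """I_P / M in units of the smallest current step, for each division ratio."""
    return [Fraction(cp_code(m) if variable else 1, m) for m in m_values]

def spread(values):
    """Ratio between the largest and the smallest value."""
    values = list(values)
    return max(values) / min(values)

cc = spread(loop_gain_factors(range(2, 33), variable=False))      # constant current
vc = spread(loop_gain_factors(range(2, 33), variable=True))       # stepped current
vc_high = spread(loop_gain_factors(range(5, 33), variable=True))  # excluding M = 2..4
```

Under this assumed mapping the factor IP/M varies by 16 times with a constant current, by 2 times with the stepped current, and by 1.6 times once M exceeds 4, which is qualitatively in line with the observation that only the range M = 2–4 degrades.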
3.4. Lock detect circuit
The lock detect circuit gives a logic level output “locked”
to indicate whether the PLL is in the locked state. As shown
in Fig. 9, when the difference in the pulse widths between the
UP and DOWN signals is below one specific value, the PLL is
lock detect circuit. When the lock condition is satisfied for
32 periods of the reference clock, the signal “locked” goes
high. As soon as the lock condition is no longer satisfied, the
signal “locked” goes low.
In the ideal PLL system, the difference in the pulse widths
between the UP and DOWN signals will turn to zero finally.
But a real PLL will always have such a difference due to many
charge currents and process variation. To ease the effect of
these non-ideal factors, the window of the locked state is
reconfigurable by tuning the delay in Fig. 9.
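The lock decision described in this section can be modelled behaviourally as follows. This Python sketch assumes that one pulse-width difference between UP and DOWN is observed per reference period and that the configurable window is a simple threshold on its magnitude; the function name, the parameter names and the units are hypothetical.

```python
def lock_detect(pulse_width_diffs, window, hold_cycles=32):
    """Return the "locked" level after each reference period.

    "locked" goes high once the UP/DOWN pulse-width difference has stayed
    inside the window for hold_cycles consecutive reference periods, and
    drops immediately when a single period falls outside the window.
    """
    locked = []
    run = 0  # number of consecutive in-window periods
    for diff in pulse_width_diffs:
        run = run + 1 if abs(diff) < window else 0
        locked.append(run >= hold_cycles)
    return locked

# Made-up trace: 5 periods far from lock, 40 periods in lock, one glitch, 10 more in lock.
trace = [100] * 5 + [1] * 40 + [100] + [1] * 10
state = lock_detect(trace, window=10)
```

Widening `window` makes the detector tolerant of a static UP/DOWN mismatch, which is the role the text assigns to the tunable delay.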
4. Measurement results
The proposed clock generator is fabricated in a 0.13-μm
CMOS process with a 1.5 V supply voltage. The chip
microphotograph of the clock generator and the layout of the core
circuits are shown in Fig. 10. The PLL occupies an active area
of 0.36 mm², including the serial data interface and its control
Figure 11(a) shows the test waveforms of the PLL enable
signal, the “locked” signal and the output clock signal after the
Table 1. Performance comparison.
tuning range (MHz)
Power consumption (mW)
Power down feature (mW)
Active area (mm²)
Lock time (μs)
Multi function capability
15.7 @ 486 MHz
< 10.7 @ 486 MHz
N/A (< 300 ps pk–pk)
Programmable phase shift
and duty cycle, phase
0.13 μm CMOS
18 @ 1 GHz
< 9 (< 58 ps pk–pk)
Programmable phase shift
and duty cycle, phase
0.13 μm CMOS    Process technology    N/A    0.13 μm CMOS
Fig. 11. Measured (a) lock state signal versus PLL enable signal and
(b) output signal with programmable duty cycle and phase shift
(deteriorated due to the bandwidth of the oscilloscope).
VCO starts to oscillate and the PLL starts the locking process.
After about 2 μs, the locked state signal goes high,
which means that the PLL has a fast lock time of approximately
2 μs. When the enable signal is low, the locked
state signal goes low and the whole PLL is powered
down. The measured supply current with the PLL locked at 1 GHz is
about 13 mA, and less than 1 mA in the power-down state.
Figure 11(b) shows the output signal of the proposed clock
generator. The PLL is locked at 1 GHz. The division ratio of
the post-scale counter is 20, so the frequency of the output
signal is 50 MHz. The duty cycle of the first waveform W1 is
30%, and the duty cycle of W2 is 50%. W3 leads the two
waveforms above by a phase of π/2. The
programmable duty cycle and phase shift are reconfigured by
the serial interface and registers.
According to the measurement results, the proposed PLL
shows a wide tuning range of 270 MHz to 1.5 GHz with the
Fig. 12. Measured jitter histogram.
RMS jitter of 9 ps at 1 GHz, as shown in Fig. 12. In fact, the
by the post-scale counter. So the measured result has the jitter
contribution from the post-scale counter. The real RMS jitter
should be less than 9 ps at 1 GHz. There are also other reasons
why the measured jitter is a little larger than that of some
low-jitter designs. First, the wide tuning range requires a high
VCO gain, and the noise contribution grows in proportion to that
gain. Second, the measurement is carried out in a real application
environment: the proposed clock generator is integrated into an
FPGA chip, and all of the test work is done with the application
circuit of the FPGA chip.
The skew compensation mode is also verified by feeding
back the output signal through the bidirectional I/O. The phase
phase difference between the two signals due to the mismatch
between the charge and discharge currents in the charge pump
and the mismatch between the two paths from the input and
output clock I/Os to the input terminals of the oscilloscope,
The measured performance of the clock generator is summarized
in Table 1 and compared with recently published clock
generators and the one in the Cyclone FPGA. In contrast with
the recently reported PLLs[11, 12], the reconfigurable
multi-functions are the highlights of this work, alongside its
comparatively satisfying performance. Compared with the
commercial one[13], this work extended the frequency tuning range
58 ps (the peak–peak period jitter) and shortened the lock time
from 10 to 2 μs.
5. Conclusion
This paper presents a reconfigurable multi-function PLL
clock generator with a wide tuning range of 270 MHz to
1.5 GHz for FPGA applications. By configuration, the output
generator also has the function of phase alignment between the
clock tree or output clock and the input clock. A quick start-
up technique is proposed to guarantee accurate phase shift and
phase alignment. A novel delay partition method is proposed
to improve the speed of the post-scale counter. The total power
consumption at 1 GHz is 18 mW with a 1.5 V voltage supply,
while it is less than 1.5 mW in the power down state.
References
[1] Williams S, Thompson H, Hufford M, et al. An improved CMOS
ring oscillator PLL with less than 4 ps RMS accumulated jitter.
IEEE Custom Integrated Circuits Conference, 2004
[2] Sai A, Yamaji T, Itakura T. A low-jitter clock generator based
on ring oscillator with 1/f noise reduction technique for
next-generation mobile wireless terminals. IEEE Asian Solid-State
Circuits Conference, 2008: 425
[3] Von Kaenel V R. A high-speed, low-power clock generator for
a microprocessor application. IEEE J Solid-State Circuits, 1998,
[4] Lu L, Chen J, Yuan L. An 18-mW 1.175–2-GHz frequency
synthesizer with constant bandwidth for DVB-T tuners. IEEE Trans
Microw Theory Tech, 2009, 57(4): 928
[5] Lee C H, McClellan K, Choma Jr J. A supply-noise-insensitive
CMOS PLL with a voltage regulator using DC–DC capacitive
converter. IEEE J Solid-State Circuits, 2001, 36(10): 1453
[6] Fahim A M. Clock generator for SOC processors circuits and
architectures. London: Kluwer Academic Publishers, 2005
[7] Bautista D P, Aranda M L. A low power and high speed CMOS
voltage-controlled ring oscillator. IEEE International
Symposium on Circuits and Systems, 2004, 4: 752
[8] Hwang I C, Kim C, Kang S M. A CMOS self-regulating VCO
with low supply sensitivity. IEEE J Solid-State Circuits, 2004,
[9] Zhang H, Yang H G, Zhang J, et al. High-speed programmable
counter design for PLL based on a delay partition technique.
IEEE International Symposium on Radio-Frequency Integration
Technology, 2009: 100
[10] Lee T H. The design of CMOS radio-frequency integrated
circuits. 2nd ed. Cambridge University Press, 2004
[11] Ebuchi T, Komatsu Y, Okamoto T, et al. A 125–1250 MHz
process-independent adaptive bandwidth spread spectrum clock
generator with digital controlled self-calibration. IEEE J
Solid-State Circuits, 2009, 44(3): 763
[12] Lee J, Kim K, Lee J, et al. A 480-MHz to 1-GHz sub-picosecond
clock generator with a fast and accurate automatic frequency
calibration in 0.13-μm CMOS. IEEE Asian Solid-State Circuits
Conference, 2007: 67
[13] Cyclone Device Handbook, Volume 1, http://www.altera.com