A New Synchronous Circuit for Elastic
Pipeline Architecture
Md. Ashraful Islam
Baysand Inc
Dhaka, Bangladesh
ash_apee@yahoo.com
Md. Yeasin Arafath
Baysand Inc
Dhaka, Bangladesh
ara2ras@yahoo.com
Mamun-Ur-Rashid Khandker
Dept. of Applied Physics and Electronic Engg.
Rajshahi University
Rajshahi, Bangladesh
khandker@ru.ac.bd
Abstract— Pipelining is a method of circuit design which
divides logic into stages separated by intermediate latches or
registers. A simple pipeline circuit for synchronous design is
presented. The main features of this pipeline architecture are its
elastic communication channels and its ease of implementation in
both ASIC (Application Specific Integrated Circuit) and FPGA
flows, independent of the EDA (Electronic Design Automation)
tool. The pipeline uses edge-triggered flip-flops in its datapath
and a controller that generates the clock enable signal for the
datapath. With this approach, elasticity can be introduced at the
level of functional units (e.g. ALUs, memories). The formal
specification of the protocol is defined and the implementation of
elasticity in the pipeline is discussed.
Index Terms—Elastic Pipeline, FPGA, ASIC, Synchronous
Interlocked.
I. INTRODUCTION
The early concept of interlocked pipelines was developed for
asynchronous pipeline designs [1], [2]. Asynchronous
pipelines [1] have several properties that are potentially
beneficial to circuit design. The most attractive features are the
ability to activate a pipeline stage only in the presence of valid
data and the use of local control decisions for pipeline
interlocking. Asynchronous pipelines inherently provide
elasticity, which means that a variable number of data items can
appear in the pipeline at any time [5]. If there is no congestion
and data items are injected at wide intervals, the items are widely
spaced in the pipeline and travel through it rapidly. If input rates
are higher, the spacing between items becomes tighter. In the
extreme case, with a slow or stalled output environment, data
items become bunched or stalled at close intervals. In all cases,
input data items are processed as they arrive, even with an
unknown or irregular arrival rate; there is no wait for a clock
edge. Hence, the inter-token spacing and the throughput rate
are determined dynamically [6].
However, poor CAD support hinders wide acceptance of
asynchronous methodologies. Moreover, most of the
asynchronous pipelines proposed by early researchers depend
on the fabrication technology.
Latency insensitive schemes proposed in [7] separate the
communication channels from computational units. This
synchronous system uses relay stations at the interfaces
between computational units [8] [9].
Synchronous interlocked pipelines [10] were proposed to
achieve fine-grained interlocking at the level of stages. Each
stage is interlocked with its neighboring stages in both the
forward and backward directions. As far as we know, this was the
first interlocked synchronous pipeline technique. However, the
whole design was based on latches. The pipeline circuit proposed
in [10] is shown in Fig. 1. Although latches provide some tolerance
to clock skew and allow slack passing and time borrowing between
pipeline stages, latches are not traditionally used in ASIC and
FPGA designs. Synthesis tools have provided only limited
support for latch-based designs. Latches are also not friendly
to DFT tools [3]. For scan testing, they are often replaced by
a flip-flop compatible with the scan-test shift register. Under
these conditions, a flip-flop would actually be less expensive
than a latch. Another problem is that latch-based designs are
more difficult to verify. A latch's output must be valid at the
clock edge that causes it to go opaque, and that edge can serve as
a hard clock-edge boundary for formal verification tools.
Unfortunately, FPGA and ASIC verification tools do not
support this methodology. Here we propose a similar
interlocked synchronous pipeline that uses edge-triggered
flip-flops to synchronize events on the edge of a clock. For
power reduction we have also applied the clock gating
technique to the data channel. Our clock gating circuit is
supported by most EDA (Electronic Design Automation)
tools [4].
Fig. 1. Two-phase clocked interlocked synchronous pipeline implementing
forward and backward interlock [10]
International Conference on Materials, Electronics & Information Engineering, ICMEIE-2015
05-06 June, 2015, Faculty of Engineering, University of Rajshahi, Bangladesh
www.ru.ac.bd/icmeie2015/proceedings/
ISBN 978-984-33-8940--4
II. ELASTIC SYNCHRONOUS INTERLOCKED PIPELINE
A. Specification of the Handshake Protocol
Pipeline interlocking is a technique to control the flow of
data through pipelined systems. It is typically achieved through
handshake techniques. We have adopted the handshaking protocol
described in [10]. This protocol defines the propagation of data
in the forward direction together with valid bits. A valid bit
indicates that the data in the channel is valid for the associated
pipeline stage. Stall bits are propagated in the backward direction
of the pipeline. These stall bits indicate when the pipeline must
halt; that is, the destination module cannot receive further data
until the stall bit is de-asserted. The valid/stall handshake
interface model is shown in Fig. 2.
A new data item is read into a data register only when the
data item is valid and the register is not stalled. A data register
does not stall until it is filled with valid data. A stall condition
does not need to propagate backward through a stage that holds no
valid data (the absence of valid data indicates a hole).
Therefore, the whole pipeline system is not stalled
unless the pipeline completely fills up. This protocol improves
the throughput of the system in the presence of stalls, because
valid data continues to move forward until all holes have
been filled. The behavior of the handshake protocol is illustrated
in the state diagram in Fig. 3.
Fig. 2. Interface of elastic synchronous interlocked pipeline
The possible states in a pipeline stage are:
Idle: indicating that the stage does not hold any valid data.
Valid: indicating that the stage has valid data
received from its previous stage and is ready to
deliver it to its next stage.
Stalled: indicating that the stage is not ready to
receive new data from its previous stage and
remains in that state until it is ready to receive new
data again.
Fig. 3. Possible transition of valid/stall handshake protocol
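The three states listed above can be sketched in a few lines of Python. This is an illustration only: the names StageState, stage_state and next_valid are hypothetical and do not appear in the paper, and treating Stalled as "valid data held while the stall input is asserted" is one reading of the text, not a transcription of Fig. 3.

```python
from enum import Enum


class StageState(Enum):
    IDLE = "idle"        # no valid data in the stage (a hole)
    VALID = "valid"      # valid data, free to move to the next stage
    STALLED = "stalled"  # valid data, blocked by the next stage


def stage_state(valid: bool, stall_in: bool) -> StageState:
    """Classify a stage from its valid bit and its stall input (assumed mapping)."""
    if not valid:
        return StageState.IDLE  # a hole has nothing to stall
    return StageState.STALLED if stall_in else StageState.VALID


def next_valid(valid: bool, stall_in: bool, valid_in: bool) -> bool:
    """Valid bit of the stage after the next rising clock edge."""
    if valid and stall_in:
        return True      # stalled: the stage keeps its data
    return valid_in      # otherwise it takes whatever the previous stage offers
```

Under this reading, an idle stage ignores its stall input, which is why a hole can still be filled while the stages after it are stalled.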
B. Implementation of elastic synchronous interlocked pipeline
Our proposed pipeline circuit is shown in Fig. 4. The valid bit
propagates in the forward direction (left to right) on every
positive edge of the clock. The stall bit is allowed to propagate
during the low period of the clock. The data is captured by the
gated clock. The AND function between the stall input and the valid
output of a stage ensures that holes in the pipeline are filled, by
disabling the stall signal when no valid data is present.
Two latches are used in each stage for clock gating. These
latches are enabled at the low level of the clock, which yields
stable clock gating enable signals during the high level of the
clock. This ensures a glitch-free gated clock, as recommended
by most EDA tools. Data in the input channel is
captured on the positive edge of the gated clock. The block
diagram of the pipeline circuit used for simulation is shown in
Fig. 5, and the simulation of the pipeline circuit is shown in Fig. 6.
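The behavior described above can be approximated by a cycle-level Python model. It is a behavioral sketch, not the circuit of Fig. 4: the function names step and run are hypothetical, the clock-gating latches are abstracted into a per-stage "hold" decision, and the stall of each stage is assumed to be the AND of its stall input and its own valid bit, as stated in the text.

```python
def step(valid, data, in_valid, in_data, ext_stall):
    """Advance an N-stage valid/stall pipeline by one clock cycle.

    valid, data : per-stage register contents (lists of equal length)
    in_valid, in_data : item offered to stage 0 in this cycle
    ext_stall : stall input of the last stage (e.g. from a memory)
    Returns (valid, data, input_accepted, output_item_or_None).
    """
    n = len(valid)
    # Clock-low phase: the stall ripples backward; a hole (valid == False) blocks it.
    stall = [False] * n
    downstream = ext_stall
    for i in reversed(range(n)):
        stall[i] = downstream and valid[i]  # AND of stall input and valid output
        downstream = stall[i]
    # The last stage delivers its item when it is valid and not stalled.
    out = data[-1] if valid[-1] and not ext_stall else None
    # Rising edge: every stage that is not stalled samples its upstream neighbor.
    new_valid, new_data = list(valid), list(data)
    for i in range(n):
        if stall[i]:
            continue  # clock enable low: the register holds its contents
        up_valid = in_valid if i == 0 else valid[i - 1]
        up_data = in_data if i == 0 else data[i - 1]
        new_valid[i] = up_valid
        if up_valid:
            new_data[i] = up_data  # the gated clock fires only for valid data
    accepted = in_valid and not stall[0]
    return new_valid, new_data, accepted, out


def run(items, stall_pattern, n=3):
    """Feed items into an n-stage pipeline, one cycle per entry of stall_pattern."""
    valid, data = [False] * n, [None] * n
    pending, outputs = list(items), []
    for ext_stall in stall_pattern:
        in_valid = bool(pending)
        in_data = pending[0] if pending else None
        valid, data, accepted, out = step(valid, data, in_valid, in_data, ext_stall)
        if accepted:
            pending.pop(0)
        if out is not None:
            outputs.append(out)
    return outputs, valid
```

For example, run([1, 2, 3, 4, 5], [True] * 5) leaves all three stages valid and produces no output, because the stalled pipeline keeps accepting items only until its holes are filled. Extending the stall pattern with ten unstalled cycles drains the items in their original order.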
C. Clock Constraint
To calculate the maximum clock frequency, we have to find
the worst-case path delay. The worst-case path for an N-stage
pipelined circuit is shown in Fig. 7. We make the following
assumptions: at any stage K, (a) valid_out[K] arrives at the
input of the AND gate during the clock high period, and (b)
stall_out[K] can only change during the clock low period.
Note that the valid_out signal is generated within the
stage, whereas the stall_in signal comes from the next stage.
For Stage[N], however, stall_in[N] comes from an external
source, for example a memory.
Under these assumptions, the clock high period must satisfy
TCLK_H ≥ tC_Q, where tC_Q is the clock-to-output delay of the
flip-flop. If tPD is the propagation delay of each stage (from
Fig. 7, the sum of the AND gate delay and the latch delay), then
the clock low time must satisfy TCLK_L ≥ K*tPD. If the maximum
logic delay between consecutive stages is tLOGIC, then the clock
period of a conventional pipeline architecture [11] is T_CLK ≥
tLOGIC + tSU_F + tC_Q + tSKEW, where tSU_F is the setup time of
the datapath flip-flop and tSKEW is the maximum clock skew
between pipeline stages. For the proposed pipeline, the minimum
clock period TCLK is the larger of TCLK_H + TCLK_L and T_CLK,
that is, TCLK ≥ MAX(T_CLK, TCLK_H + TCLK_L).
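The constraint above can be evaluated numerically with a short Python helper. The function name min_clock_period and the delay values in the usage note are hypothetical and chosen only for illustration; the helper takes the smallest TCLK_H and TCLK_L that the inequalities allow.

```python
def min_clock_period(t_logic, t_su_f, t_c_q, t_skew, t_pd, k):
    """Smallest clock period allowed by the constraints in the text.

    All delays must use the same unit (picoseconds in the example below).
    k is the stage count K that multiplies tPD in TCLK_L >= K*tPD.
    """
    t_conventional = t_logic + t_su_f + t_c_q + t_skew  # T_CLK of a conventional pipeline
    t_clk_h = t_c_q                                     # smallest allowed clock high time
    t_clk_l = k * t_pd                                  # smallest allowed clock low time
    return max(t_conventional, t_clk_h + t_clk_l)
```

With the illustrative values tLOGIC = 2000, tSU_F = 100, tC_Q = 200, tSKEW = 100 and tPD = 300 (all in picoseconds), K = 4 gives 2400 ps, so the datapath logic limits the clock. With K = 10 the stall path dominates and the result is 3200 ps.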
Fig. 4. Proposed pipeline circuit
Fig. 5. Block diagram of the pipeline circuit for simulation
Fig. 6. Simulation of the pipeline circuit
Fig. 7. Worst case path
III. CONCLUSIONS
A new scheme for synchronous elastic pipeline design has
been presented. An efficient flip-flop based implementation
combines the efficiency of synchronous implementations with the
reduced power consumption of clock gating. The proposed
scheme can be applied at different levels of a system, in both
white-box (e.g. microprocessor design) and black-box
(e.g. SoC IP) scenarios. A drawback of this design is the long
combinational path in the backward direction, i.e. the
stall propagation path. This drawback can be overcome by
cutting the path with an inserted buffer.
REFERENCES
[1] I.E. Sutherland, “Micropipelines,” Communications of the
ACM, vol. 32, no. 6, pp. 720-738, June 1989.
[2] E.J. McLellan, “Reducing stall delay in pipelined computer
system using queue between pipeline stages,” Digital Equipment
Corporation, U.S. patent 5325495 (1994).
[3] D. Chinnery, K. Keutzer, J. Sanghavi, E. Killian and K. Sheth,
“Automatic Replacement of Flip-Flops by Latches in ASICs,”
Closing the Gap Between ASIC & Custom, Kluwer Academic
Publishers, 2002, pp. 187-208.
[4] Synopsys. [Online]. Available:
https://www.synopsys.com/COMPANY/PUBLICATIONS/SYNOPSYSINSIGHT/Pages/Art2-reduceadvsynthesis-IssQ4-11.aspx
[5] D.E. Muller, “Asynchronous Logics and Application to
Information Processing,” Proc. Symp. the Application of
Switching Theory to Space Technology, Stanford University
Press, 1963, pp. 289-297.
[6] S.M. Nowick and M. Singh, “High-Performance Asynchronous
Pipelines: An Overview,” Design & Test of Computers, IEEE,
vol. 28, no. 5, pp. 8-22, Sept.-Oct. 2011.
[7] L. Carloni, K.L. McMillan and A.L. Sangiovanni-Vincentelli,
“Theory of latency-insensitive design,” IEEE Transactions on
Computer-Aided Design, vol. 20, no. 9, pp. 1059-1076, Sept.
2001.
[8] L.P. Carloni and A.L. Sangiovanni-Vincentelli, “Coping with
latency in SoC design,” IEEE Micro, Special Issue on Systems on
Chip, vol. 22, no. 5, pp. 12, Oct. 2002.
[9] Tiberiu Chelcea and Steven M. Nowick, “Robust interfaces for
mixed-timing systems with application to latency-insensitive
protocols,” Proc. ACM/IEEE Design Automation Conference,
June 2001.
[10] Hans M. Jacobson, Prabhakar N. Kudva, Pradip Bose, Peter
W. Cook, Stanley E. Schuster, Eric G. Mercer, and Chris J.
Myers, “Synchronous interlocked pipelines,” Proc. International
Symposium on Advanced Research in Asynchronous Circuits
and Systems, pp. 3-12, April 2002.
[11] Wayne P. Burleson, Maciej Ciesielski, Fabian Klass, and
Wentai Liu, “Wave-Pipelining: A Tutorial and Research
Survey,” IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 6, no. 3, September 1998.