clock-to-output path) from the pipeline. The longest <SPAN CLASS="Definition">
exit delay </SPAN>
<A NAME="marker=118255">
</A>
(<A NAME="marker=118302">
</A>
clock-to-output delay) is 11.95 ns.</LI>
</OL>
<P CLASS="Body">
<A NAME="pgfId=81915">
</A>
By pipelining the design we added three clock periods of latency, but we increased the estimated operating speed. The longest prelayout critical path is now an exit delay, approximately 12 ns—more than doubling the maximum operating frequency. Next, we route the registered version of the design. The Actel software informs us that the postroute maximum stage delay is 11.3 ns (close to the preroute estimate of 9.99 ns). To check this figure we can perform another timing analysis. This time we shall measure the stage delays (the start points are all clock pins, and the end points are all inputs to sequential cells, in our case the D input to a D flip-flop). We need to define the <SPAN CLASS="Definition">
sets</SPAN>
<A NAME="marker=93385">
</A>
of nodes at which to start and end the timing analysis (similar to the path clusters we used to specify timing constraints in logic synthesis). In the Actel timing analyzer we can use predefined sets <SPAN CLASS="BodyComputer">
'clock'</SPAN>
(flip-flop clock pins) and <SPAN CLASS="BodyComputer">
'gated'</SPAN>
(flip-flop inputs) as follows:</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=38582">
</A>
timer> startset clock</P>
<P CLASS="Computer">
<A NAME="pgfId=38585">
</A>
timer> endset gated</P>
<P CLASS="Computer">
<A NAME="pgfId=38588">
</A>
timer> longest</P>
<P CLASS="Computer">
<A NAME="pgfId=38589">
</A>
1st longest path to all endpins</P>
<P CLASS="Computer">
<A NAME="pgfId=38590">
</A>
Rank Total Start pin First Net End Net End pin</P>
<P CLASS="Computer">
<A NAME="pgfId=38591">
</A>
0 11.3 a_r_ff_b2:CLK a_r_2_ block_0_OUT1 sel_r_ff:D</P>
<P CLASS="Computer">
<A NAME="pgfId=38592">
</A>
1 6.6 sel_r_ff:CLK sel_r DEF_NET_50 outp_ff_b0:D</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=38593">
</A>
... 8 similar lines omitted ...</P>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=38602">
</A>
We could try to reduce the long stage delay (11.3 ns), but we have already seen from the preroute timing estimates that an exit delay may be the critical path. Next, we check some other important timing parameters.</P>
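<P CLASS="Body">
As a rough check on which path limits the clock rate, the delays quoted above can be converted to frequencies. The following Python sketch assumes that the clock period simply equals the path delay, ignoring setup time, clock skew, and any external delays; the helper name <SPAN CLASS="BodyComputer">max_frequency_mhz</SPAN> is hypothetical and not part of any Actel tool. It compares a postroute figure with a prelayout one, so it is only indicative.</P>

```python
def max_frequency_mhz(delay_ns: float) -> float:
    """Return the clock frequency (MHz) whose period equals delay_ns."""
    if delay_ns <= 0:
        raise ValueError("delay must be positive")
    return 1000.0 / delay_ns


# Delays quoted in the text, in nanoseconds.
delays_ns = {
    "postroute stage delay": 11.3,
    "prelayout exit delay": 11.95,
}

# The longest delay sets the limit on the clock frequency.
critical_name = max(delays_ns, key=delays_ns.get)
critical_mhz = max_frequency_mhz(delays_ns[critical_name])

for name, delay in delays_ns.items():
    print(f"{name}: {delay} ns, about {max_frequency_mhz(delay):.1f} MHz")
print(f"limiting path: {critical_name}")
```

<P CLASS="Body">
Under these simplifying assumptions the exit delay, not the stage delay, sets the limit, which is why reducing the 11.3 ns stage delay alone would not help.</P>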
<DIV>
<H2 CLASS="Heading2">
<A NAME="pgfId=38606">
</A>
13.7.1 <A NAME="26379">
</A>
Hold Time</H2>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=38945">
</A>
Hold-time problems can occur if, for example, there is clock skew between adjacent flip-flops. We first check for the shortest stage delays, using the same sets that we used to check the longest stage delays:</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=38610">
</A>
timer> shortest</P>
<P CLASS="Computer">
<A NAME="pgfId=38611">
</A>
1st shortest path to all endpins</P>
<P CLASS="Computer">
<A NAME="pgfId=38612">
</A>
Rank Total Start pin First Net End Net End pin</P>
<P CLASS="Computer">
<A NAME="pgfId=38613">
</A>
0 4.0 b_rr_ff_b1:CLK b_rr_1_ DEF_NET_48 outp_ff_b1:D</P>
<P CLASS="Computer">
<A NAME="pgfId=38614">
</A>
1 4.1 a_rr_ff_b2:CLK a_rr_2_ DEF_NET_46 outp_ff_b2:D</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=38615">
</A>
... 8 similar lines omitted ...</P>
<P CLASS="Body">
<A NAME="pgfId=38624">
</A>
The shortest path delay, 4 ns, is between the clock input of a D flip-flop with instance name <SPAN CLASS="BodyComputer">
b_rr_ff_b1</SPAN>
(call this <SPAN CLASS="BodyComputer">
X</SPAN>
) and the D input of flip-flop instance name <SPAN CLASS="BodyComputer">
outp_ff_b1</SPAN>
(<SPAN CLASS="BodyComputer">
Y</SPAN>
). Due to clock skew, the clock signal may not arrive at both flip-flops simultaneously. Suppose the clock arrives at flip-flop <SPAN CLASS="BodyComputer">
Y</SPAN>
3 ns later than at flip-flop <SPAN CLASS="BodyComputer">
X</SPAN>
. The D input to flip-flop <SPAN CLASS="BodyComputer">
Y</SPAN>
is only stable for (4 – 3) = 1 ns after the clock edge. To check for hold-time violations we thus need to find the clock skew corresponding to each clock-to-D path. This is tedious, and timing-analysis tools normally check hold-time requirements automatically, but we shall show the steps to illustrate the process. </P>
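<P CLASS="Body">
The arithmetic of this check can be sketched in a few lines of Python. Here skew is taken as the clock arrival time at the capturing flip-flop minus the arrival time at the launching flip-flop, so a positive skew reduces the time that the D input stays stable. The function name, the 3 ns skew applied to both paths, and the flip-flop hold-time parameter (defaulting to zero) are assumptions for illustration; the text does not give a hold-time value for the flip-flop.</P>

```python
def hold_slack_ns(path_delay_ns, skew_ns, hold_time_ns=0.0):
    """Time the D input remains stable after the capturing clock edge,
    minus the hold time the capturing flip-flop requires.

    skew_ns = (clock arrival at capturing flip-flop)
              - (clock arrival at launching flip-flop)
    A negative result indicates a hold-time violation.
    """
    return path_delay_ns - skew_ns - hold_time_ns


# Shortest clock-to-D paths reported by the timer (launch, capture, delay in ns).
shortest_paths = [
    ("b_rr_ff_b1", "outp_ff_b1", 4.0),
    ("a_rr_ff_b2", "outp_ff_b2", 4.1),
]

ASSUMED_SKEW_NS = 3.0  # hypothetical skew, as in the example in the text

for launch, capture, delay in shortest_paths:
    slack = hold_slack_ns(delay, ASSUMED_SKEW_NS)
    print(f"{launch} -> {capture}: stable for {slack:.1f} ns after the clock edge")
```

<P CLASS="Body">
A real check would use a separate skew figure for each launch and capture pair, which is what the following sections set out to measure.</P>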
</DIV>
<DIV>
<H2 CLASS="Heading2">
<A NAME="pgfId=38608">
</A>
13.7.2 <A NAME="33634">
</A>
Entry Delay</H2>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=38946">
</A>
Before we can measure clock skew, we need to analyze the entry delays, including the clock tree. The synthesis tools automatically add I/O pads and the clock cells. This means that extra nodes are automatically added to the netlist with automatically generated names. The EDIF conversion tools may then modify these names. Before we can perform an analysis of entry delays and the clock network delay, we need to find the input node names. By looking for the EDIF <SPAN CLASS="BodyComputer">
'rename'</SPAN>
construct in the EDIF netlist we can associate the input and output node names in the behavioral Verilog model, <SPAN CLASS="BodyComputer">
comp_mux_rrr</SPAN>
, and the EDIF names,</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=82073">
</A>
piron% grep rename comp_mux_rrr_o.edn</P>
<P CLASS="Computer">
<A NAME="pgfId=82074">
</A>
(port (rename a_2_ "a[2]") (direction INPUT))</P>
<P CLASS="Computer">
<A NAME="pgfId=82075">
</A>
... 8 similar lines renaming ports omitted ...</P>
<P CLASS="Computer">
<A NAME="pgfId=82118">
</A>
(net (rename a_rr_0_ "a_rr[0]") (joined</P>
<P CLASS="Computer">
<A NAME="pgfId=82119">
</A>
... 9 similar lines renaming nets omitted ...</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=82128">
</A>
piron%</P>
<P CLASS="Body">
<A NAME="pgfId=38631">
</A>
Thus, for example, the EDIF conversion program has renamed input port <SPAN CLASS="BodyComputer">
a[2]</SPAN>
to <SPAN CLASS="BodyComputer">
a_2_</SPAN>
because the design tools do not like the Verilog bus notation using square brackets. Next we find the connections between the ports and the added I/O cells by looking for <SPAN CLASS="BodyComputer">
'PAD'</SPAN>
in the Actel format netlist, which indicates a connection to a pad and the pins of the chip, as follows:</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=38634">
</A>
piron% grep PAD comp_mux_rrr_o.adl</P>
<P CLASS="Computer">
<A NAME="pgfId=38635">
</A>
NET DEF_NET_148; outp_2_, OUTBUF_31:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38636">
</A>
NET DEF_NET_151; outp_1_, OUTBUF_32:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38637">
</A>
NET DEF_NET_154; outp_0_, OUTBUF_33:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38638">
</A>
NET DEF_NET_127; a_2_, INBUF_24:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38639">
</A>
NET DEF_NET_130; a_1_, INBUF_25:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38640">
</A>
NET DEF_NET_133; a_0_, INBUF_26:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38641">
</A>
NET DEF_NET_136; b_2_, INBUF_27:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38642">
</A>
NET DEF_NET_139; b_1_, INBUF_28:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38643">
</A>
NET DEF_NET_142; b_0_, INBUF_29:PAD.</P>
<P CLASS="Computer">
<A NAME="pgfId=38644">
</A>
NET DEF_NET_145; clock, CLKBUF_30:PAD.</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=38632">
</A>
piron%</P>
<P CLASS="Body">
<A NAME="pgfId=74842">
</A>
This tells us, for example, that the node we called <SPAN CLASS="BodyComputer">
clock</SPAN>
in our behavioral model has been joined to a node (with automatically generated name) called <SPAN CLASS="BodyComputer">
CLKBUF_30:PAD</SPAN>
, using a net (connection) named <SPAN CLASS="BodyComputer">
DEF_NET_145</SPAN>
(again automatically generated). This net is the connection between the node <SPAN CLASS="BodyComputer">
clock</SPAN>
that is dangling in the behavioral model and the clock-buffer pad cell that the synthesis tools automatically added.</P>
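<P CLASS="Body">
The two lookups above (Verilog name to EDIF name, then EDIF name to pad pin) can be chained with a short script. The Python sketch below works only on lines of the form shown in the two <SPAN CLASS="BodyComputer">grep</SPAN> listings; the regular expressions and function names are assumptions for illustration, not a general EDIF or Actel netlist parser.</P>

```python
import re

# Sample lines copied from the grep output in the text.
EDIF_LINES = [
    '(port (rename a_2_ "a[2]") (direction INPUT))',
    '(net (rename a_rr_0_ "a_rr[0]") (joined',
]
ADL_LINES = [
    "NET DEF_NET_127; a_2_, INBUF_24:PAD.",
    "NET DEF_NET_145; clock, CLKBUF_30:PAD.",
    "NET DEF_NET_154; outp_0_, OUTBUF_33:PAD.",
]

# (rename <edif_name> "<original name>")
RENAME_RE = re.compile(r'\(rename\s+(\S+)\s+"([^"]+)"\)')
# NET <net name>; <port name>, <instance>:PAD.
PAD_RE = re.compile(r'NET\s+(\S+);\s*(\S+),\s*(\S+:PAD)\.')


def edif_renames(lines):
    """Map each original (Verilog) name to its EDIF-safe name."""
    renames = {}
    for line in lines:
        for edif_name, original in RENAME_RE.findall(line):
            renames[original] = edif_name
    return renames


def pad_connections(lines):
    """Map each port name in the netlist to (net name, pad pin)."""
    pads = {}
    for line in lines:
        match = PAD_RE.search(line)
        if match:
            net, port, pin = match.groups()
            pads[port] = (net, pin)
    return pads


renames = edif_renames(EDIF_LINES)
pads = pad_connections(ADL_LINES)

# Follow Verilog port a[2] through to its pad pin.
net, pin = pads[renames["a[2]"]]
print(f'a[2] -> {renames["a[2]"]} -> {net} -> {pin}')
```

<P CLASS="Body">
Ports such as <SPAN CLASS="BodyComputer">clock</SPAN> that contain no square brackets are not renamed, so they can be looked up in the pad table directly.</P>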
</DIV>
<DIV>
<H2 CLASS="Heading2">
<A NAME="pgfId=38717">
</A>
13.7.3 Exit Delay</H2>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=38947">
</A>
We now know that the clock-pad input is <SPAN CLASS="BodyComputer">
CLKBUF_30:PAD</SPAN>
, so we can find the exit delays (the longest path between clock-pad input and an output) as follows (using the clock-pad input as the start set):</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=38729">
</A>
timer> startset clockpad</P>
<P CLASS="Computer">
<A NAME="pgfId=38730">
</A>
Working startset 'clockpad' contains 0 pins.</P>
<P CLASS="Computer">
<A NAME="pgfId=38731">
</A>
</P>
<P CLASS="Computer">
<A NAME="pgfId=38732">
</A>
timer> addstart CLKBUF_30:PAD</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=38733">
</A>
Working startset 'clockpad' contains 2 pins.</P>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=38734">
</A>
I shall explain presently why this set contains two pins and not just one. Next, we define the end set and trace the longest exit paths as follows:</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=38735">
</A>
timer> endset outpad</P>
<P CLASS="Computer">
<A NAME="pgfId=38736">
</A>
Working endset 'outpad' contains 3 pins.</P>
<P CLASS="Computer">
<A NAME="pgfId=38737">
</A>
</P>
<P CLASS="Computer">
<A NAME="pgfId=38738">
</A>
timer> longest</P>
<P CLASS="Computer">
<A NAME="pgfId=38739">
</A>
1st longest path to all endpins</P>
<P CLASS="Computer">
<A NAME="pgfId=38740">
</A>
Rank Total Start pin First Net End Net End pin</P>
<P CLASS="Computer">
<A NAME="pgfId=38741">
</A>
0 16.1 CLKBUF_30/U0:PAD DEF_NET_144 DEF_NET_154 OUTBUF_33:PAD</P>
<P CLASS="Computer">
<A NAME="pgfId=38742">
</A>
1 16.0 CLKBUF_30/U0:PAD DEF_NET_144 DEF_NET_151 OUTBUF_32:PAD</P>
<P CLASS="Computer">
<A NAME="pgfId=38743">
</A>
2 16.0 CLKBUF_30/U0:PAD DEF_NET_144 DEF_NET_148 OUTBUF_31:PAD</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=74878">
</A>
3 pins</P>
<P CLASS="BodyAfterHead">
<A NAME="pgfId=74879">
</A>
This tells us we have three paths from the clock-pad input to the three output pins (<SPAN CLASS="BodyComputer">
outp[0]</SPAN>
, <SPAN CLASS="BodyComputer">
outp[1]</SPAN>
, and <SPAN CLASS="BodyComputer">
outp[2]</SPAN>
). We can examine the longest exit delay in more detail as follows:</P>
<P CLASS="ComputerFirst">
<A NAME="pgfId=74880">
</A>
timer> expand 0</P>
<P CLASS="Computer">
<A NAME="pgfId=38747">
</A>
1st longest path to OUTBUF_33:PAD (rising) (Rank: 0)</P>
<P CLASS="Computer">
<A NAME="pgfId=38748">
</A>
Total Delay Typ Load Macro Start pin Net name</P>
<P CLASS="Computer">
<A NAME="pgfId=38749">
</A>
16.1 3.7 Tpd 0 OUTBUF OUTBUF_33:D DEF_NET_154</P>
<P CLASS="Computer">
<A NAME="pgfId=38750">
</A>
12.4 4.5 Tpd 1 DF1 outp_ff_b0:CLK DEF_NET_1530</P>
<P CLASS="ComputerLast">
<A NAME="pgfId=38751">
</A>
7.9 7.9 Tpd 16 CLKEXT_0 CLKBUF_30/U0:PAD DEF_NET_144</P>
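<P CLASS="Body">
The expanded listing reads from the end of the path back to its start, with each row giving a running total and the delay contributed by one cell. A minimal Python sketch, with the rows copied by hand from the listing above and a hypothetical helper name, confirms that the three cell delays add up to the 16.1 ns exit delay.</P>

```python
# (running total ns, cell delay ns, macro, start pin), as listed by 'expand 0'.
EXPAND_ROWS = [
    (16.1, 3.7, "OUTBUF", "OUTBUF_33:D"),
    (12.4, 4.5, "DF1", "outp_ff_b0:CLK"),
    (7.9, 7.9, "CLKEXT_0", "CLKBUF_30/U0:PAD"),
]


def check_totals(rows, tol=0.05):
    """Walk from the start of the path (last row) and verify each running total."""
    running = 0.0
    for total, delay, _macro, _pin in reversed(rows):
        running += delay
        if abs(running - total) > tol:
            return False
    return True


path_delay_ns = sum(delay for _total, delay, _macro, _pin in EXPAND_ROWS)

print(f"totals consistent: {check_totals(EXPAND_ROWS)}")
print(f"exit delay: {path_delay_ns:.1f} ns")
```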
<P CLASS="Body">
<A NAME="pgfId=71342">
</A>
The input-to-clock delay, t<SUB CLASS="Subscript">
IC</SUB>
, due to the clock-buffer cell (or macro) <SPAN CLASS="BodyComputer">
CLKEXT_0</SPAN>
, instance name <SPAN CLASS="BodyComputer">
CLKBUF_30/U0</SPAN>
, is 7.9 ns. The clock-to-Q delay, t<SUB CLASS="Subscript">
CQ</SUB>
, of flip-flop cell <SPAN CLASS="BodyComputer">
DF1</SPAN>
, instance name <SPAN CLASS="BodyComputer">
outp_ff_b0</SPAN>