<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML EXPERIMENTAL 970324//EN">

<HTML>

<HEAD>

<META NAME="GENERATOR" CONTENT="Adobe FrameMaker 5.5/HTML Export Filter">



<TITLE> 12.12&nbsp;Optimization of the Viterbi Decoder</TITLE></HEAD><!--#include file="top.html"--><!--#include file="header.html"-->




<H1 CLASS="Heading1">

<A NAME="pgfId=262995">

 </A>

12.12&nbsp;<A NAME="28656">

 </A>

Optimization of the Viterbi Decoder</H1>

<P CLASS="BodyAfterHead">

<A NAME="pgfId=268408">

 </A>

Returning to the Viterbi decoder example (from Section 12.4), we first set the <SPAN CLASS="Definition">

environment</SPAN>

<A NAME="marker=269311">

 </A>

 for the design using the following worst-case conditions: a die temperature of 25<SPAN CLASS="Symbol">

&#176;</SPAN>

C (fastest logic) to 120<SPAN CLASS="Symbol">

&#176;</SPAN>

C (slowest logic); a power supply voltage of <SPAN CLASS="EquationVariables">

V</SPAN>

<SUB CLASS="Subscript">

DD</SUB>

  =  5.5  V (fastest logic) to <SPAN CLASS="EquationVariables">

V</SPAN>

<SUB CLASS="Subscript">

DD</SUB>

  =  4.5  V (slowest logic); and worst process (slowest logic) to best process (fastest logic). Assume that this ASIC should run at a clock frequency of at least 33  MHz (a clock period of about 30  ns). An initial synthesis run gives a critical path delay of about 25  ns at nominal conditions (the default setting) and nearly 35  ns under worst-case conditions, using a high-density 0.6  <SPAN CLASS="Symbol">

&#181;</SPAN>

m standard-cell target library. </P>

<P CLASS="Body">

<A NAME="pgfId=269147">

 </A>

Estimates (using simulation and calculation) show that data arrives at the input pins 5  ns (worst case) after the rising edge of the clock. The reset signal arrives 10  ns (worst case) after the rising edge of the clock. The outputs of the Viterbi decoder must be stable at least 4  ns before the rising edge of the clock, so that these signals can be driven to another ASIC in time to be clocked. These timing constraints are particularly severe: the 5  ns input delay and the 4  ns output requirement together reduce the usable clock period by 9  ns, leaving about 21  ns of the 30  ns period for the logic. However, these figures are typical for board-level delays.</P>

<P CLASS="Body">

<A NAME="pgfId=264788">

 </A>

The initial synthesis runs reveal that the critical path passes through the following six modules:</P>

<P CLASS="ComputerOneLine">

<A NAME="pgfId=269141">

 </A>

subset_decode -&gt; compute_metric -&gt; <BR>

compare_select -&gt; reduce -&gt; metric -&gt; output_decision</P>

<P CLASS="BodyAfterHead">

<A NAME="pgfId=264785">

 </A>

The logic synthesizer can do little or no optimization across these module boundaries. The next step, then, is to rearrange the design hierarchy for synthesis. <SPAN CLASS="Definition">

Flattening</SPAN>

<A NAME="marker=319750">

 </A>

 (<A NAME="marker=319751">

 </A>

merging or <A NAME="marker=319752">

 </A>

ungrouping) the six modules into a new cell, called <SPAN CLASS="BodyComputer">

critical</SPAN>

, allows the synthesizer to reduce the critical path delay by optimizing one large module. </P>

<P CLASS="Body">

<A NAME="pgfId=297751">

 </A>

At present the last module in the critical path is <SPAN CLASS="BodyComputer">

output_decision</SPAN>

. This combinational logic adds 2&#8211;3  ns to the output delay requirement of 4  ns (this means the outputs of the module <SPAN CLASS="BodyComputer">

metric</SPAN>

 must be stable 6&#8211;7  ns before the rising clock edge). Registering the output reduces this overhead and removes the module <SPAN CLASS="BodyComputer">

output_decision</SPAN>

 from the critical path. The disadvantage is an increase in latency of one clock cycle, but the latency is already 12 clock cycles in this design. If registering the output reduces the critical path delay (and hence the clock period) to less than 12  /  13 of its previous value, the total latency, measured in time rather than in clock cycles, will still improve.</P>

<P CLASS="Body">

<A NAME="pgfId=269241">

 </A>

To register the output, alter the code (on pages 575&#8211;576) as follows:</P>

<P CLASS="ComputerFirst">

<A NAME="pgfId=251151">

 </A>

<B CLASS="Keyword">

module</B>

 viterbi_ASIC</P>

<P CLASS="Computer">

<A NAME="pgfId=251156">

 </A>

... </P>

<P CLASS="Computer">

<A NAME="pgfId=251244">

 </A>

<B CLASS="Keyword">

wire</B>

 [2:0] Out, Out_r; // Change: add Out_r.</P>

<P CLASS="Computer">

<A NAME="pgfId=251175">

 </A>

...<B CLASS="Keyword">

 </B>

</P>

<P CLASS="Computer">

<A NAME="pgfId=251243">

 </A>

	asPadOut 					#(3,&quot;30,31,32&quot;) u30 (padOut, Out_r); // Change: Out_r.</P>

<P CLASS="Computer">

<A NAME="pgfId=251185">

 </A>

	Outreg o_1 (Out, Out_r, Clk, Res); // Change: add output register.</P>

<P CLASS="ComputerLast">

<A NAME="pgfId=251179">

 </A>

	...</P>

<P CLASS="ComputerLast">

<A NAME="pgfId=251896">

 </A>

<B CLASS="Keyword">

endmodule</B>

 </P>

<P CLASS="Computer">

<A NAME="pgfId=251897">

 </A>

<B CLASS="Keyword">

module</B>

 Outreg (Out, Out_r, Clk, Res); // Change: add this module.</P>

<P CLASS="Computer">

<A NAME="pgfId=251898">

 </A>

<B CLASS="Keyword">

input</B>

 [2:0] Out; <B CLASS="Keyword">

input</B>

 Clk, Res; <B CLASS="Keyword">

output</B>

 [2:0] Out_r; </P>

<P CLASS="Computer">

<A NAME="pgfId=251882">

 </A>

	dff #(3) reg1(Out, Out_r, Clk, Res);</P>

<P CLASS="ComputerLast">

<A NAME="pgfId=251902">

 </A>

<B CLASS="Keyword">

endmodule</B>

 </P>
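
<P CLASS="Body">

The register that <SPAN CLASS="BodyComputer">Outreg</SPAN> instantiates (<SPAN CLASS="BodyComputer">dff</SPAN>) is defined elsewhere, so its behavior is not visible in this listing. As a rough illustration only, the following behavioral sketch shows what a 3-bit output register of this kind might look like. The module name <SPAN CLASS="BodyComputer">Outreg_sketch</SPAN> is hypothetical, and the active-low asynchronous reset is an assumption; the reset polarity and style of the actual <SPAN CLASS="BodyComputer">dff</SPAN> cell may differ.</P>

<P CLASS="ComputerFirst">
// Hypothetical behavioral sketch; not part of the original design.</P>

<P CLASS="Computer">
// Assumption: Res is an active-low asynchronous reset.</P>

<P CLASS="Computer">
<B CLASS="Keyword">module</B> Outreg_sketch (Out, Out_r, Clk, Res);</P>

<P CLASS="Computer">
	<B CLASS="Keyword">input</B> [2:0] Out; // unregistered decoder output</P>

<P CLASS="Computer">
	<B CLASS="Keyword">input</B> Clk, Res;</P>

<P CLASS="Computer">
	<B CLASS="Keyword">output</B> [2:0] Out_r; // registered output that drives the pads</P>

<P CLASS="Computer">
	<B CLASS="Keyword">reg</B> [2:0] Out_r;</P>

<P CLASS="Computer">
	<B CLASS="Keyword">always</B> @(<B CLASS="Keyword">posedge</B> Clk <B CLASS="Keyword">or</B> <B CLASS="Keyword">negedge</B> Res)</P>

<P CLASS="Computer">
		<B CLASS="Keyword">if</B> (!Res) Out_r &lt;= 3'b000; // clear the register on reset</P>

<P CLASS="Computer">
		<B CLASS="Keyword">else</B> Out_r &lt;= Out; // capture Out on each rising clock edge</P>

<P CLASS="ComputerLast">
<B CLASS="Keyword">endmodule</B></P>

<P CLASS="Body">

With a register of this kind in place, the output pads are driven directly from flip-flop outputs, so the 4  ns output requirement no longer has to be met through the combinational logic of <SPAN CLASS="BodyComputer">output_decision</SPAN>.</P>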

<P CLASS="BodyAfterHead">

<A NAME="pgfId=269126">

 </A>

Flattening the critical path and registering the output move the performance closer to the target. Prelayout estimates indicate that the die perimeter required for the I/O pads will enclose more than enough area to hold the core logic. Since there is unused area in the core, it makes sense to switch to a high-performance standard-cell library with a slightly larger cell height (96<SPAN CLASS="Symbol">

 &#955;</SPAN>

 versus 72<SPAN CLASS="Symbol">

 &#955;</SPAN>

). This cell library is less dense, but faster.</P>

<P CLASS="Body">

<A NAME="pgfId=251974">

 </A>

Typically, at this point, the design is improved by altering the HDL, the hierarchy, and the synthesis controls in an iterative manner until the desired performance is achieved. However, remember there is still no information from the layout. The best that can be done is to estimate the contribution of the interconnect using wire-load models. As soon as possible the netlist should be passed to the floorplanner (or the place-and-route software in the absence of a floorplanner) to generate better estimates of interconnect delays.</P>

<TABLE>

<TR>

<TD ROWSPAN="1" COLSPAN="2">

<P CLASS="TableTitle">

<A NAME="pgfId=393981">

 </A>

TABLE&nbsp;12.13&nbsp;<A NAME="26932">

 </A>

Critical-path timing report for the Viterbi decoder.</P>

</TD>

</TR>

<TR>

<TD ROWSPAN="1" COLSPAN="1">

<P CLASS="TableFirst">

<A NAME="pgfId=393985">

 </A>

Instance name</P>

</TD>

<TD ROWSPAN="1" COLSPAN="1">

<P CLASS="TableFirst">

<A NAME="pgfId=393990">

 </A>

Delay information<A HREF="#pgfId=393989" CLASS="footnote">

1</A>

</P>

</TD>

</TR>

<TR>

<TD ROWSPAN="1" COLSPAN="1">

<P CLASS="Computer">

<A NAME="pgfId=393992">

 </A>

v_1.u100</P>

<P CLASS="Computer">

<A NAME="pgfId=393993">

 </A>

&nbsp;</P>

<P CLASS="Computer">

<A NAME="pgfId=393994">

 </A>

u1.subout5.Q_ff_b0</P>

<P CLASS="Computer">

<A NAME="pgfId=393995">

 </A>

B1_i67 </P>

<P CLASS="Computer">

<A NAME="pgfId=393996">

 </A>

B1_i66 </P>

<P CLASS="Computer">

<A NAME="pgfId=393997">

 </A>

B1_i64 </P>

<P CLASS="Computer">

<A NAME="pgfId=393998">

 </A>

B1_i68 </P>

<P CLASS="Computer">

<A NAME="pgfId=393999">

 </A>

B1_i316</P>

<P CLASS="Computer">

<A NAME="pgfId=394000">

 </A>

u3.add_rip1.u4</P>

<P CLASS="Computer">