     -place_algorithm {bounding_box | net_timing_driven | path_timing_driven} 
          Controls the algorithm used by the placer. 
          Bounding_box focuses purely on minimizing the bounding box wirelength of the circuit. 
          Path_timing_driven focuses on minimizing both wirelength and the critical path delay. 
          Net_timing_driven is similar to path_timing_driven, but assumes that all nets have the same 
          delay when estimating the critical path during placement, rather than using the current 
          placement to obtain delay estimates. 
          Default: path_timing_driven. 

     -place_cost_type {linear | nonlinear} 
          Select the (wirelength portion of the) placement cost function. For FPGAs in which all channels 
          have the same width the linear cost function reduces to a bounding box wirelength cost 
          function. The nonlinear cost function, on the other hand, considers both wirelength and 
          congestion during placement. 
          Default: linear. 

          Note: Nonlinear is not supported in this release and may give unusual results. 

     -place_chan_width <int> 
          Can be used with the nonlinear cost function to tell VPR how many tracks a channel of relative 
          width 1 is expected to need to complete routing of this circuit. VPR will then place the circuit 
          only once, and repeatedly try routing the circuit as usual. If place_chan_width is not specified 
          and the nonlinear cost is used, VPR will replace and reroute the circuit for each channel width 
          at which it attempts to map the circuit. 


5.2.3 Placement Options Valid Only With Timing-Driven Placement 

     Timing-driven placement is used by default, unless the architecture file is missing timing 
information. 

     -timing_tradeoff <float> 
          Controls the trade-off between bounding box minimization and delay minimization in the placer. 
          A value of 0 makes the placer focus completely on bounding box (wirelength) minimization, 
          while a value of 1 makes the placer focus completely on timing optimization. 
          Default: 0.5. 

     -recompute_crit_iter <int> 
          Controls how many temperature updates occur before the placer performs a timing analysis to 
          update its estimate of the criticality of each connection. 
          Default: 1. 

     -inner_loop_recompute_divider <int> 
          Controls how many times the placer performs a timing analysis to update its criticality 
          estimates while at a single temperature. 
          Default: 0. 

     -td_place_exp_first <float> 
          Controls how critical a connection is considered as a function of its slack, at the start of 
          the anneal. If this value is 0, all connections are considered equally critical. If this value 
          is large, connections with small slacks are considered much more critical than connections 
          with large slacks. As the anneal progresses, the exponent used in the criticality computation 
          gradually changes from its starting value of td_place_exp_first to its final value of 
          td_place_exp_last. 
          Default: 1. 

     -td_place_exp_last <float> 
          Controls how critical a connection is considered as a function of its slack, at the end of 
          the anneal. See discussion for -td_place_exp_first, above. 
          Default: 8. 
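
The way -timing_tradeoff, -td_place_exp_first and -td_place_exp_last interact can be sketched as below. This is an illustration only, not VPR's code. The criticality model (1 - slack / d_max) ** exponent, the linear interpolation of the exponent, and the normalized weighted sum are assumptions that this manual does not state, and all function names are hypothetical.

```python
def crit_exponent(progress, exp_first=1.0, exp_last=8.0):
    """Exponent at a point in the anneal; progress runs from 0.0 (start) to 1.0 (end).

    Linear interpolation is an assumption: the manual only says the exponent
    changes gradually from td_place_exp_first to td_place_exp_last.
    """
    return exp_first + (exp_last - exp_first) * progress


def criticality(slack, d_max, exponent):
    """Assumed criticality model; d_max stands for the critical path delay."""
    return (1.0 - slack / d_max) ** exponent


def delta_cost(d_timing, d_wiring, prev_timing, prev_wiring, timing_tradeoff=0.5):
    """Assumed cost change of a proposed move, as a normalized weighted sum.

    timing_tradeoff = 0 leaves only the wirelength term,
    timing_tradeoff = 1 leaves only the timing term.
    """
    return (timing_tradeoff * d_timing / prev_timing
            + (1.0 - timing_tradeoff) * d_wiring / prev_wiring)
```

With an exponent of 0 every connection gets criticality 1, which matches the statement that all connections are then considered equally critical.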

5.2.4 Router Options 


   -max_router_iterations <int> 
         The number of iterations of a Pathfinder-based router that will be executed before a circuit 
         is declared unrouteable (if it hasn’t routed successfully yet) at a given channel width. 
         Default: 50. 
         Speed-quality trade-off: reduce this number to speed up the router, at the cost of some 
         increase in final track count. This is most effective if -initial_pres_fac is simultaneously 
         increased. 

   -initial_pres_fac <float> 
         Sets the starting value of the present overuse penalty factor. 
         Default: 0.5. 
         Speed-quality trade-off: increase this number to speed up the router, at the cost of some 
         increase in final track count. Values of 1000 or so are perfectly reasonable. 

   -first_iter_pres_fac <float> 
         Similar to -initial_pres_fac. This sets the present overuse penalty factor for the very first 
         routing iteration. -initial_pres_fac sets it for the second iteration. 
         Default: 0.5. 

   -pres_fac_mult <float> 
         Sets the growth factor by which the present overuse penalty factor is multiplied after each 
         router iteration. 
         Default: 1.3. 

   -acc_fac <float> 
         Specifies the accumulated overuse factor (historical congestion cost factor). 
         Default: 1. 

   -bb_factor <int> 
         Sets the distance (in channels) outside of the bounding box of its pins a route can go. Larger 
         numbers slow the router somewhat, but allow for a more exhaustive search of possible routes. 
         Default: 3. 

   -base_cost_type {demand_only | delay_normalized | intrinsic_delay} 
         Sets the basic cost of using a routing node (resource). Demand_only sets the basic cost of a 
         node according to how much demand is expected for that type of node. Delay_normalized is 
         similar, but normalizes all these basic costs to be of the same magnitude as the typical delay 
         through a routing resource. Intrinsic_delay sets the basic cost of a node to its intrinsic delay. 
         Default: delay_normalized for the timing-driven router and demand_only for the breadth-first 
         router. 
         Note: intrinsic_delay is not supported in this release and may give unusual results. 

   -bend_cost <float> 
         The cost of a bend. Larger numbers will lead to routes with fewer bends, at the cost of some 
         increase in track count. If only global routing is being performed, routes with fewer bends will 
         be easier for a detailed router to subsequently route onto a segmented routing architecture. 
         Default: 1 if global routing is being performed, 0 if combined global/detailed routing is being 
         performed. 

   -route_type {global | detailed} 
         Specifies whether global routing or combined global and detailed routing should be performed. 
         Default: detailed (i.e. combined global and detailed routing). 

   -route_chan_width <int> 
         Tells VPR to route the circuit with a certain channel width. No binary search on channel 
         capacity will be performed to find the minimum number of tracks required for routing -- VPR 
         simply reports whether or not the circuit will route at this channel width. 

   -router_algorithm {breadth_first | timing_driven | directed_search} 
         Selects which router algorithm to use. The breadth-first router focuses solely on routing a 
         design successfully, while the timing-driven router focuses both on achieving a successful 
         route and achieving good circuit speed. The breadth-first router is capable of routing a design 
         using slightly fewer tracks than the timing-driven router (typically 5% if the timing-driven 
         router uses its default parameters; this can be reduced to about 2% if the router parameters 
         are set so the timing-driven router pays more attention to routability and less to area). The 
         designs produced by the timing-driven router are much faster, however (2x - 10x), and it uses 
         less CPU time to route. The directed_search router is routability-driven and uses an A* 
         heuristic to improve runtime over breadth_first. 
         Default: timing_driven. 
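
The three present-overuse options (-first_iter_pres_fac, -initial_pres_fac, -pres_fac_mult) together define a per-iteration schedule. The sketch below is one reading of the descriptions above, not VPR's code. The function name is hypothetical, and it is an assumption that multiplication by -pres_fac_mult starts after the second iteration.

```python
def pres_fac_schedule(iterations, first_iter_pres_fac=0.5,
                      initial_pres_fac=0.5, pres_fac_mult=1.3):
    """Return the present overuse penalty factor used in each router iteration."""
    schedule = []
    pres_fac = first_iter_pres_fac          # very first iteration
    for i in range(iterations):
        if i == 1:
            pres_fac = initial_pres_fac     # second iteration
        elif i > 1:
            pres_fac *= pres_fac_mult       # grows after every later iteration
        schedule.append(pres_fac)
    return schedule


# Example: an aggressive schedule that penalizes overuse quickly.
print(pres_fac_schedule(5, first_iter_pres_fac=0.0,
                        initial_pres_fac=1.0, pres_fac_mult=2.0))
```

A larger -initial_pres_fac or -pres_fac_mult makes the penalty climb sooner, which is why the text pairs a lower -max_router_iterations with a higher -initial_pres_fac.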

5.2.5 Timing-Driven Router Options 

     -astar_fac <float> 
           Sets how aggressive the directed search used by the timing-driven router is. Values between 1 
           and 2 are reasonable, with higher values trading some quality for reduced CPU time. 
           Default: 1.2. 

     -max_criticality <float> 
           Sets the maximum fraction of routing cost that can come from delay (vs. coming from 
           routability) for any net. A value of 0 means no attention is paid to delay; a value of 1 means 
           nets on the critical path pay no attention to congestion. 
           Default: 0.99. 

     -criticality_exp <float> 
           Controls the delay-routability trade-off for nets as a function of their slack. If this value 
           is 0, all nets are treated the same, regardless of their slack. If it is very large, only nets 
           on the critical path will be routed with attention paid to delay. Other values produce more 
           moderate trade-offs. 
           Default: 1. 
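
The roles of -max_criticality and -criticality_exp can be illustrated with a small sketch. The criticality model (1 - slack / d_max) ** criticality_exp and the weighted-sum cost are assumptions not stated in this manual, and the function names are hypothetical.

```python
def connection_criticality(slack, d_max, max_criticality=0.99, criticality_exp=1.0):
    """Assumed criticality of a connection, capped at max_criticality.

    d_max stands for the critical path delay; slack = 0 means the
    connection lies on the critical path.
    """
    crit = (1.0 - slack / d_max) ** criticality_exp
    return min(crit, max_criticality)


def node_cost(delay, congestion_cost, crit):
    """Assumed cost of using a routing node: crit is the fraction taken from delay."""
    return crit * delay + (1.0 - crit) * congestion_cost
```

In this model a criticality_exp of 0 gives every connection the same criticality (max_criticality), while a large exponent drives the criticality of everything off the critical path toward 0.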


6. File Formats 

     In all the file formats that follow, a sharp (#) character anywhere in a line indicates that the 
rest of the line is a comment, while a backslash (\) at the end of a line (and not in a comment) 
means that this line is continued on the line below. 
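
The comment and continuation rules can be sketched as a small preprocessing step. This is a minimal illustration, not VPR's parser. The function name is hypothetical, and collapsing runs of whitespace is an assumption made for convenience.

```python
def logical_lines(text):
    """Yield comment-free logical lines, joining backslash-continued lines."""
    pending = ""
    for raw in text.splitlines():
        # Everything from '#' onward is a comment, so a backslash inside
        # a comment never counts as a continuation.
        line = raw.split("#", 1)[0].rstrip()
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        line = " ".join((pending + line).split())  # normalize whitespace
        pending = ""
        if line:
            yield line
    # A trailing continuation with nothing after it is emitted as-is.
    if pending.strip():
        yield " ".join(pending.split())


sample = "pinlist: a b \\\n    c   # three nets\n# comment only\n.global clk"
for line in logical_lines(sample):
    print(line)
```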

6.1 Circuit Netlist (.net) Format 

     Three different circuit elements are available: input pads, output pads, and functional 
blocks. Input and output pads are specified using the keywords .input and .output respectively, 
while functional blocks are specified by .[name]. The .[name] for the functional block 
must correspond with the .[name] specified in the architecture file. For example, a .clb block in 
the netlist must be defined by a .clb in the architecture file. The format is shown below. 

element_type_keyword blockname 
    pinlist: net_a net_b net_c ... 
    subblock: subblock_name pin_num1 pin_num2 ...    # Only needed if a functional block 

     A circuit element is created by specifying a keyword at the start of a line, followed by the 
name to be used to identify this block. The line immediately below this keyword line starts with 
the identifier pinlist: and then lists the names of the nets connected to each pin of the functional 
block or pad. Input and output pads (.input and .output) have only one pin, while functional 
blocks (.[name]) have as many pins as the architecture file used for this run of VPR specifies. 
The first net listed in the pinlist connects to pin 0 of a functional block, and so on. If some pin of 
a functional block is to be left unconnected, the corresponding entry in the pinlist should specify 
the reserved word open instead of a net name. 
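
As an illustration of the keyword line and pinlist line described above, the sketch below turns the two lines into a block record. It is not VPR's parser. The function name is hypothetical, and it assumes comments have already been stripped.

```python
def parse_block(keyword_line, pinlist_line):
    """Return (element_type, block_name, {pin_number: net_name}).

    Pins listed as the reserved word 'open' are left out of the mapping.
    """
    element_type, block_name = keyword_line.split()
    tokens = pinlist_line.split()
    if not tokens or tokens[0] != "pinlist:":
        raise ValueError("expected a line starting with 'pinlist:'")
    # The first net listed connects to pin 0, the next to pin 1, and so on.
    pins = {i: net for i, net in enumerate(tokens[1:]) if net != "open"}
    return element_type, block_name, pins


print(parse_block(".clb simple", "pinlist: a b open and2 open"))
```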
     Functional blocks (.[name]) also have to specify the internal contents of the functional block 
with subblock lines. Each functional block must have at least one subblock line, and can have 
up to max_subblocks subblock lines, where max_subblocks is set in the architecture file. A 
functional block may have fewer than max_subblocks subblock lines, since some of the subblocks 
in the functional block may be unused. Each subblock is a K-input O-output boolean logic element 
(BLE) (where K is set via the max_subblock_inputs attribute and O is set via the 
max_subblock_outputs attribute in the architecture description file) and a flip flop, as shown in 
Figure . The subblock line first gives the name of the subblock, and then gives the functional 
block pin or a subblock output pin within this functional block to which each BLE pin is 
connected. If a BLE pin is unconnected, the corresponding pin entry should be set to the 
keyword open. The order of the BLE pins is: the max_subblock_inputs input pins, the 
max_subblock_outputs output pins, and the clock input (max_subblock_inputs + 
max_subblock_outputs + 1 pins total). 
     Each of the subblock BLE input pins can be connected to any of the functional block input 
pins, or to the output of any of the subblocks in this functional block. A connection to a 
functional block input pin is specified by giving the number of the functional block pin in the 
appropriate place, while a connection to a subblock output is specified by 
“ble_<subblock_number>”. For example, to connect to functional block pin 0, one lists 0 in the 
appropriate place, while to connect to the output of subblock 0, one lists ble_0 in the 
appropriate place. Each subblock clock pin can similarly be connected to either a clb input pin 
or the output of a subblock in the same logic block. If the subblock clock pin is “open”, all the 
BLE outputs are unregistered outputs; otherwise all the BLE outputs are assumed to be 
registered. The entry corresponding to the subblock output pin specifies the number of the 
functional block output pin to which it connects, or open if this subblock output does not 
connect to any clb output pin (which happens when a subblock output is used only locally, 
within a logic block). 
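
The subblock pin ordering and the ble_<subblock_number> notation can be checked with a short sketch. This is an illustration under the rules stated above, not VPR's parser. The function name and the returned dictionary layout are hypothetical.

```python
def decode_subblock(line, num_inputs, num_outputs):
    """Decode a 'subblock:' line into its input, output and clock connections.

    num_inputs and num_outputs stand for max_subblock_inputs and
    max_subblock_outputs from the architecture file.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "subblock:":
        raise ValueError("expected a line starting with 'subblock:'")
    name, pins = tokens[1], tokens[2:]
    expected = num_inputs + num_outputs + 1  # inputs, outputs, clock
    if len(pins) != expected:
        raise ValueError(f"expected {expected} pin entries, got {len(pins)}")

    def source(entry):
        # 'open' = unconnected, 'ble_N' = output of subblock N,
        # a bare number = functional block pin.
        if entry == "open":
            return None
        if entry.startswith("ble_"):
            return ("subblock", int(entry[4:]))
        return ("pin", int(entry))

    inputs = [source(p) for p in pins[:num_inputs]]
    outputs = [None if p == "open" else int(p)
               for p in pins[num_inputs:num_inputs + num_outputs]]
    clock = source(pins[-1])
    return {"name": name, "inputs": inputs, "outputs": outputs,
            "clock": clock, "registered": clock is not None}


print(decode_subblock("subblock: sb_one 0 1 open 3 open", 3, 1))
```

An open clock entry yields registered = False, matching the rule that a subblock without a clock has unregistered outputs.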
     The only other keyword is .global. Use .global lines to specify that a net or nets should not 
be considered by the placement cost function or routed. It is assumed that some global routing 
resources exist to route these very high fanout signals (generally clocks). The syntax of the 
.global statement is: 


.global net_a net_b ... 



     An example netlist in which the logic block is a single BLE is given below. 

#This netlist describes a small circuit with two inputs 
#and one output. There is only one clb block, which is 
#a 3-input BLE (LUT+FF) that has one unconnected input. 
#This netlist assumes that the architecture input file defines 
#a clb as a 3-input BLE with pins 0, 1, and 2 being the LUT inputs, 
#pin 3 being the LUT output, and pin 4 being the BLE clock. 

.input a                               #Input pad. 
        pinlist: a                     #Blocks can have the same 
                                      #name as nets with no conflict. 

.input bpad 
        pinlist: b 

.clb simple                            # Logic block. 
        pinlist: a b open and2 open                    # 2 LUT inputs used, 
                                                      # clock input unconnected. 
        subblock: sb_one 0 1 open 3 open               # Subblock line says the 
                                                      # same thing.