Version 5.00, March 1, 2008
- Andy Ye's modification to include support for the uni-directional and
single-drive routing architectures.
- Andy Ye's modification to include support for single-tile based
architectures.
- Mark Wei Fang's modifications for mux balancing
- Jason Luu and Ted Campbell's XML architecture format
- Jason Luu and Ted Campbell's heterogeneous structures
Version 4.30, March 25, 2000
- Sandy Marquardt's timing-driven placement enhancements are in this
version. Previous versions of VPR had only routability-driven placement.
I cleaned up various minor things: removed an unused #define NEVER from
main.c, set the date in the title bar, and fixed a couple of mistakes in
the command line processor's checking for invalid combinations of
options. I regression tested this code fairly thoroughly against the
last release.
This version was released on the web.
Version 4.22, Jan. 26, 1999
- Changed "pr.h" to "vpr_types.h" and "ext.h" to "globals.h".
The old names were around just for historical reasons; the new filenames
are clearer.
- Changed the input netlist format slightly so that subblock output pins
can be hooked directly to unused (open) CLB output pins. When the netlist
specifies this, it means that a CLB OPIN of that class must be used by
that subblock, since the subblock is directly connected to an OPIN. This
correctly models Altera LABs, for example. The router has been altered so
that it ensures any "locally used CLB OPINs" specified by the netlist are
properly reserved (i.e. are not used by other connections).
This change lets me correctly model both types of CLB output pin logical
equivalence. One equivalence type corresponds to there being muxes between
a set of subblocks in a logic block and a set of CLB output pins. In this
case, if some of the subblock outputs don't have to go outside the CLB,
the other subblocks can use more than one CLB OPIN if it helps routability
(for routing a high-fanout net for example). There is another type of
logically equivalent outputs, however. If the subblock outputs are hooked
directly to the CLB outputs, but a set of subblocks are all identical and
muxes let you make any connection you want to their inputs, then all
the CLB OPINs connected to these subblocks are logically equivalent. In this
case, however, a subblock whose output is only used locally (never goes
outside the CLB) still consumes the OPIN connected to it, so other
subblocks cannot use this OPIN. This is the case in Altera LABs, for
example. The changes I've made let the netlist specify either type of
logical equivalence (or even a mix of both types).
- Made the delay of a routing switch 1 second when doing a global routing
(old value was 0). A routing resource must have a delay > 0 for the
timing-driven router to work properly.
Version 4.21, Nov. 19, 1998
- Freed the button array in close_graphics (graphics.c) -- prevents a small
memory leak that would otherwise occur if VPR started and shut down the
graphics window several times. Noticed by Paul Leventis.
- Changed place_cost_type from int to an enumerated type (just for
cleanliness).
- Changed read_arch.c so purify doesn't complain about me printing
out unset values in print_arch (the values were uninitialized only
when they weren't relevant for the architecture used, so this is
just a cosmetic cleanup).
Version 4.20, Sept. 10, 1998
- Simplified equation used to count transistors in pass transistor muxes --
Sandy noticed that the "approximate" equation I listed was actually exact,
and simpler than the summation used in VPR.
- Changed all "class" variables to iclass so the code compiles with a C++
compiler. Paul Leventis wanted this.
- Changed draw.c to correctly draw architectures in which connected x-directed
(or y-directed) wire segments can overlap some in a channel. Needed by
Steve Trimberger for some of the architectures he's looking at.
Version 4.19, July 7, 1998
- Put an fabs around one of the tests for perturbing an input switch
pattern (in rr_graph.c). The lack of an fabs meant that I was
perturbing the switch patterns half the time when I didn't need to.
That could have hurt routability a little bit, although probably not
a whole lot.
- Made a CLB output pin driving a global net only a warning when the
CLB pin isn't global. This allows locally generated clocks to be
generated by CLBs, and put on global resources. Paul Leventis wanted
this so he could run one of his new circuits (des) without changing the
architecture file around.
Version 4.18, June 1, 1998
- Changed the rr_graph generator so that it perturbs the input pin switch
pattern whenever the Fc_input is a perfect multiple of Fc_output (so that
the output and input switch patterns are prevented from lining up perfectly).
- Added -timing_analysis_only_with_net_delay <float> parameter. This lets
you just use VPR's timing analyzer from the command line, with the delay
of each net set to a constant and other delays taken from the architecture
file. Sandy and Catherine Wong both wanted this.
Version 4.17, April 30, 1998
- Fixed a minor problem in the rr_graph generator -- the perturbed IPIN
connection block routine could create two switches from the last track
in a channel to one IPIN under certain (weird) conditions (basically very
low track counts). Fixed it by changing a min to a mod.
- Changed the information printed out about the critical path slightly -- now
counts of normal nets and of global nets (i.e. the clock) on the critical
path are computed and printed out separately.
- Cleaned up the code some by making chan_width_io, chan_x_dist and chan_y_dist
part of the chan_width_dist structure and passing that structure around to
the places that need it, rather than having them be global variables.
These variables weren't used widely enough to justify their being global.
- Cleaned up the code a bit by moving a few functions that didn't really
belong in place.c into place_and_route.c. This file now contains the
overall control routines that start the placer and router, etc. Place.c
now just has the placer itself in it.
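The min-to-mod fix in the first item of this version can be pictured with a
small sketch. This is not the rr_graph generator itself: the function names,
the evenly stepped track layout, and the use_mod switch are assumptions chosen
only to show why clamping can double up on the last track while wrapping does
not.

```c
/* Hypothetical sketch, not VPR's code.
 * Fills tracks[0..fc-1] with the track that each switch of one IPIN connects
 * to, starting at track `start` and advancing by `step`, in a channel that is
 * `width` tracks wide. With use_mod == 0, indices past the end are clamped to
 * the last track (the "min" behaviour); otherwise they wrap around ("mod"). */
void ipin_tracks(int start, int step, int fc, int width, int use_mod,
                 int *tracks)
{
    int i;

    for (i = 0; i < fc; i++) {
        int raw = start + i * step;

        if (use_mod)
            tracks[i] = raw % width;
        else
            tracks[i] = raw < width - 1 ? raw : width - 1;
    }
}

/* Counts entries that repeat an earlier track, i.e. doubled-up switches. */
int count_duplicates(const int *tracks, int n)
{
    int i, j, dups = 0;

    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            if (tracks[i] == tracks[j]) {
                dups++;
                break;
            }
        }
    }
    return dups;
}
```

With a 3-track channel, start = 1, step = 1 and fc = 3, the clamped version
lands on the last track twice, while the wrapped version reaches three
distinct tracks.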
Version 4.16, April 20, 1998
- Fixed bug where architectures with both pass transistors and buffered
segments, and paths from buffered to unbuffered segments and back again,
could go into an infinite loop in the router. Problem: the cost is not
monotonic in my directed, timing-driven router under certain cases. The
cost of an unbuffered node can be less after you go from it to a buffered
segment and back, due to the reduction in upstream resistance. To fix it,
I remember the backward_path_cost back to the start of the routing
connection, and never allow a node to be re-expanded if it would make the
backward_path_cost go down. Now loops can never occur.
Version 4.15, April 15, 1998
- Added extra info to the timing graph so I can tell what kind of resource
each timing node is. This lets me print out and display the critical
path. Added code to print out the critical path and graphically display
it.
Version 4.14, April 9, 1998
- Added the optional keyword "global" to CLB inpin statements in the .arch
file. This lets pins used only for global signals (e.g. clocks) be
flagged as special pins that shouldn't connect into the general purpose
routing. This stops muxes, etc. from being built for them, so the area
numbers are more accurate. Also, it means they don't disrupt the input
pin switch pattern -- since input pin switches are evenly distributed
across the tracks, for low Fc's hooking the clock ipins into the normal
routing meant some tracks connected to the clock pin and not to any of the
normal ipins.
The netlist checker has been beefed up to check that global signals only
connect to global CLB pins and so on.
- Changed the switch pattern used for CLB input pins when Fc_input = Fc_output.
The old VPR would make perfectly regular switch patterns, so when Fc_input
= Fc_output, an output could only talk to certain inputs if Fc was low.
This would result in an FPGA with very poor routability (at least if
the switch box was subset (planar) -- the Wilton switch box would probably
work OK even with this IPIN switch pattern since it lets every track get
to every other one).
The change checks for the case Fc_input = Fc_output and perturbs the switch
pattern for CLB input pins to make it different from the output pin pattern.
Hence an output can talk to more input pins, and the FPGA is more
routable. The perturbation is as small as I could think to make it, since
I still wanted the switches nicely distributed over the tracks and logically
equivalent inputs on the same side should try to hit different tracks.
Version 4.13, March 26, 1998
- Changed the router defaults (acc_fac, first_iter_pres_fac, pres_fac_mult)
to the values my experimentation determined were the best (from experiments
on a buffered, unit-length wire architecture). Also, the timing-driven
router now uses acc_fac = 0 for the first router iteration.
- Changed the area model slightly. It now assumes that the buffers from tracks
to ipin_cblocks are 4x minimum width, as are the buffers from the output
muxes to logic block input pins.
- Changed rr_graph_timing_params.c so it shares the pull-up, pull-down part
of tri-state buffer switches in the routing. This means the input capacitance
of a buffer is added into a node's capacitance only once at a given (i,j)
location for that node, since other switches at the same spot using buffers
will share that buffer.
Version 4.12, Jan. 26, 1998
- Added code to allow several different base cost types; some appropriate
for area-based routing, some appropriate for timing-driven routing.
- Wrote the timing-driven router. It uses an A-star directed search algorithm
and keeps track of the Elmore delay of each node in the partial route
tree as it constructs it. Everything is done except the dynamic base cost
changing net by net.
- Split route.c into route_common.c, route_timing.c and route_breadth_first.c.
- Code cleanup. Removed the net_block_pin_num array and made that data a
part of the net structure (member blk_pin). Changed the name of net.pins
to net.blocks. Removed the net.tempcost and net.ncost members (since they
were only used by the placer) and added two static arrays to place.c instead.
- Changed rr_base_cost to a member of the rr_indexed_data structure. Anything
that is the same for all segments of a given type can be stored in this
structure to save data. Put various timing values and such for quick
computation of expected costs to a target here for use by the timing-driven
router.
- Changed segment_stats.c to use the length information stored in rr_indexed_
data, so it doesn't need local static stuff anymore.
- Fixed a minor bug in read_arch.c -- I wasn't setting the loneline parameter
properly when setting up an FPGA architecture for global routing only.
Thanks to Russ Tessier at MIT for finding this.
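For readers unfamiliar with the Elmore delay mentioned in the timing-driven
router item of this version, here is a minimal sketch for the simplest case, a
single unbranched RC chain. The router itself tracks delay over a partial route
tree; this function, its name, and its array layout are assumptions for
illustration only and are not taken from route_timing.c.

```c
/* Hypothetical sketch: Elmore delay from the driver to the last node of an
 * unbranched RC chain with n nodes.
 *   r[i] = resistance of the segment entering node i
 *   c[i] = capacitance at node i
 * Each resistance is charged with all the capacitance downstream of it, so
 * the chain is walked from the far end back toward the driver. */
double elmore_delay_chain(const double *r, const double *c, int n)
{
    double downstream_c = 0.0;
    double delay = 0.0;
    int i;

    for (i = n - 1; i >= 0; i--) {
        downstream_c += c[i];           /* capacitance at and beyond node i */
        delay += r[i] * downstream_c;
    }
    return delay;
}
```

For example, with r = {1, 2} and c = {3, 4} the delay is
1 * (3 + 4) + 2 * 4 = 15 in whatever consistent units are used.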
Version 4.11, Dec. 4, 1997
- Stopped storing the cost of an rr_node in the router, and now store only
acc_cost and pres_cost. This enables dynamic costing of resources by only
changing the rr_base_cost array. The cost of a node is then computed as each
node is expanded during routing. This slows the router down by 7%.
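The idea of computing a node's cost on demand can be sketched as follows. The
struct and function names are hypothetical, and combining the three terms by
multiplication is an assumption made for this example; the entry above only
says that acc_cost and pres_cost are stored and that the cost is computed when
a node is expanded.

```c
/* Hypothetical per-node congestion data; only these two values are stored. */
struct node_cong {
    float acc_cost;     /* accumulated (historical) congestion cost */
    float pres_cost;    /* present congestion cost */
};

/* Computes the cost of a node at expansion time from its base cost and the
 * stored congestion terms. The product form is an assumption of this sketch. */
float node_cost(float base_cost, const struct node_cong *n)
{
    return base_cost * n->acc_cost * n->pres_cost;
}
```

Because the base cost is passed in rather than folded into a stored total,
changing it takes effect the next time the node is expanded, at the price of
the extra arithmetic on every expansion.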
Version 4.10, Dec. 3, 1997