📄 rfc1058.txt
字号:
the best route to any given destination is remembered. If the
gateway involved in that route should crash, or the network
connection to it break, the calculation might never reflect the
change. The algorithm as shown so far depends upon a gateway
notifying its neighbors if its metrics change. If the gateway
crashes, then it has no way of notifying neighbors of a change.
In order to handle problems of this kind, distance vector protocols
must make some provision for timing out routes. The details depend
upon the specific protocol. As an example, in RIP every gateway that
participates in routing sends an update message to all its neighbors
once every 30 seconds. Suppose the current route for network N uses
gateway G. If we don't hear from G for 180 seconds, we can assume
that either the gateway has crashed or the network connecting us to
it has become unusable. Thus, we mark the route as invalid. When we
hear from another neighbor that has a valid route to N, the valid
route will replace the invalid one. Note that we wait for 180
seconds before timing out a route even though we expect to hear from
each neighbor every 30 seconds. Unfortunately, messages are
occasionally lost by networks. Thus, it is probably not a good idea
to invalidate a route based on a single missed message.
As we will see below, it is useful to have a way to notify neighbors
that there currently isn't a valid route to some network. RIP, along
with several other protocols of this class, does this through a
normal update message, by marking that network as unreachable. A
specific metric value is chosen to indicate an unreachable
destination; that metric value is larger than the largest valid
metric that we expect to see. In the existing implementation of RIP,
16 is used. This value is normally referred to as "infinity", since
it is larger than the largest valid metric. 16 may look like a
surprisingly small number. It is chosen to be this small for reasons
that we will see shortly. In most implementations, the same
convention is used internally to flag a route as invalid.
2.2. Preventing instability
The algorithm as presented up to this point will always allow a host
or gateway to calculate a correct routing table. However, that is
still not quite enough to make it useful in practice. The proofs
referred to above only show that the routing tables will converge to
the correct values in finite time. They do not guarantee that this
time will be small enough to be useful, nor do they say what will
happen to the metrics for networks that become inaccessible.
It is easy enough to extend the mathematics to handle routes becoming
inaccessible. The convention suggested above will do that. We
choose a large metric value to represent "infinity". This value must
be large enough that no real metric would ever get that large. For
the purposes of this example, we will use the value 16. Suppose a
network becomes inaccessible. All of the immediately neighboring
gateways time out and set the metric for that network to 16. For
purposes of analysis, we can assume that all the neighboring gateways
have gotten a new piece of hardware that connects them directly to
the vanished network, with a cost of 16. Since that is the only
connection to the vanished network, all the other gateways in the
system will converge to new routes that go through one of those
gateways. It is easy to see that once convergence has happened, all
the gateways will have metrics of at least 16 for the vanished
network. Gateways one hop away from the original neighbors would end
up with metrics of at least 17; gateways two hops away would end up
with at least 18, etc. As these metrics are larger than the maximum
metric value, they are all set to 16. It is obvious that the system
will now converge to a metric of 16 for the vanished network at all
gateways.
Unfortunately, the question of how long convergence will take is not
amenable to quite so simple an answer. Before going any further, it
will be useful to look at an example (taken from [2]). Note, by the
way, that what we are about to show will not happen with a correct
implementation of RIP. We are trying to show why certain features
are needed. Note that the letters correspond to gateways, and the
lines to networks.
A-----B
\ / \
\ / |
C / all networks have cost 1, except
| / for the direct link from C to D, which
|/ has cost 10
D
|<=== target network
Each gateway will have a table showing a route to each network.
However, for purposes of this illustration, we show only the routes
from each gateway to the network marked at the bottom of the diagram.
D: directly connected, metric 1
B: route via D, metric 2
C: route via B, metric 3
A: route via B, metric 3
Now suppose that the link from B to D fails. The routes should now
adjust to use the link from C to D. Unfortunately, it will take a
while for this to this to happen. The routing changes start when B
notices that the route to D is no longer usable. For simplicity, the
chart below assumes that all gateways send updates at the same time.
The chart shows the metric for the target network, as it appears in
the routing table at each gateway.
time ------>
D: dir, 1 dir, 1 dir, 1 dir, 1 ... dir, 1 dir, 1
B: unreach C, 4 C, 5 C, 6 C, 11 C, 12
C: B, 3 A, 4 A, 5 A, 6 A, 11 D, 11
A: B, 3 C, 4 C, 5 C, 6 C, 11 C, 12
dir = directly connected
unreach = unreachable
Here's the problem: B is able to get rid of its failed route using a
timeout mechanism. But vestiges of that route persist in the system
for a long time. Initially, A and C still think they can get to D
via B. So, they keep sending updates listing metrics of 3. In the
next iteration, B will then claim that it can get to D via either A
or C. Of course, it can't. The routes being claimed by A and C are
now gone, but they have no way of knowing that yet. And even when
they discover that their routes via B have gone away, they each think
there is a route available via the other. Eventually the system
converges, as all the mathematics claims it must. But it can take
some time to do so. The worst case is when a network becomes
completely inaccessible from some part of the system. In that case,
the metrics may increase slowly in a pattern like the one above until
they finally reach infinity. For this reason, the problem is called
"counting to infinity".
You should now see why "infinity" is chosen to be as small as
possible. If a network becomes completely inaccessible, we want
counting to infinity to be stopped as soon as possible. Infinity
must be large enough that no real route is that big. But it
shouldn't be any bigger than required. Thus the choice of infinity
is a tradeoff between network size and speed of convergence in case
counting to infinity happens. The designers of RIP believed that the
protocol was unlikely to be practical for networks with a diameter
larger than 15.
There are several things that can be done to prevent problems like
this. The ones used by RIP are called "split horizon with poisoned
reverse", and "triggered updates".
2.2.1. Split horizon
Note that some of the problem above is caused by the fact that A and
C are engaged in a pattern of mutual deception. Each claims to be
able to get to D via the other. This can be prevented by being a bit
more careful about where information is sent. In particular, it is
never useful to claim reachability for a destination network to the
neighbor(s) from which the route was learned. "Split horizon" is a
scheme for avoiding problems caused by including routes in updates
sent to the gateway from which they were learned. The "simple split
horizon" scheme omits routes learned from one neighbor in updates
sent to that neighbor. "Split horizon with poisoned reverse"
includes such routes in updates, but sets their metrics to infinity.
If A thinks it can get to D via C, its messages to C should indicate
that D is unreachable. If the route through C is real, then C either
has a direct connection to D, or a connection through some other
gateway. C's route can't possibly go back to A, since that forms a
loop. By telling C that D is unreachable, A simply guards against
the possibility that C might get confused and believe that there is a
route through A. This is obvious for a point to point line. But
consider the possibility that A and C are connected by a broadcast
network such as an Ethernet, and there are other gateways on that
network. If A has a route through C, it should indicate that D is
unreachable when talking to any other gateway on that network. The
other gateways on the network can get to C themselves. They would
never need to get to C via A. If A's best route is really through C,
no other gateway on that network needs to know that A can reach D.
This is fortunate, because it means that the same update message that
is used for C can be used for all other gateways on the same network.
Thus, update messages can be sent by broadcast.
In general, split horizon with poisoned reverse is safer than simple
split horizon. If two gateways have routes pointing at each other,
advertising reverse routes with a metric of 16 will break the loop
immediately. If the reverse routes are simply not advertised, the
erroneous routes will have to be eliminated by waiting for a timeout.
However, poisoned reverse does have a disadvantage: it increases the
size of the routing messages. Consider the case of a campus backbone
connecting a number of different buildings. In each building, there
is a gateway connecting the backbone to a local network. Consider
what routing updates those gateways should broadcast on the backbone
network. All that the rest of the network really needs to know about
each gateway is what local networks it is connected to. Using simple
split horizon, only those routes would appear in update messages sent
by the gateway to the backbone network. If split horizon with
poisoned reverse is used, the gateway must mention all routes that it
learns from the backbone, with metrics of 16. If the system is
large, this can result in a large update message, almost all of whose
entries indicate unreachable networks.
In a static sense, advertising reverse routes with a metric of 16
provides no additional information. If there are many gateways on
one broadcast network, these extra entries can use significant
bandwidth. The reason they are there is to improve dynamic behavior.
When topology changes, mentioning routes that should not go through
the gateway as well as those that should can speed up convergence.
However, in some situations, network managers may prefer to accept
somewhat slower convergence in order to minimize routing overhead.
Thus implementors may at their option implement simple split horizon
rather than split horizon with poisoned reverse, or they may provide
a configuration option that allows the network manager to choose
which behavior to use. It is also permissible to implement hybrid
schemes that advertise some reverse routes with a metric of 16 and
omit others. An example of such a scheme would be to use a metric of
16 for reverse routes for a certain period of time after routing
changes involving them, and thereafter omitting them from updates.
2.2.2. Triggered updates
Split horizon with poisoned reverse will prevent any routing loops
that involve only two gateways. However, it is still possible to end
up with patterns in which three gateways are engaged in mutual
deception. For example, A may believe it has a route through B, B
through C, and C through A. Split horizon cannot stop such a loop.
This loop will only be resolved when the metric reaches infinity and
the network involved is then declared unreachable. Triggered updates
are an attempt to speed up this convergence. To get triggered
updates, we simply add a rule that whenever a gateway changes the
metric for a route, it is required to send update messages almost
immediately, even if it is not yet time for one of the regular update
message. (The timing details will differ from protocol to protocol.
Some distance vector protocols, including RIP, specify a small time
delay, in order to avoid having triggered updates generate excessive
network traffic.) Note how this combines with the rules for
computing new metrics. Suppose a gateway's route to destination N
goes through gateway G. If an update arrives from G itself, the
receiving gateway is required to believe the new information, whether
the new metric is higher or lower than the old one. If the result is
a change in metric, then the receiving gateway will send triggered
updates to all the hosts and gateways directly connected to it. They
in turn may each send updates to their neighbors. The result is a
cascade of triggered updates. It is easy to show which gateways and
hosts are involved in the cascade. Suppose a gateway G times out a
route to destination N. G will send triggered updates to all of its
neighbors. However, the only neighbors who will believe the new
information are those whose routes for N go through G. The other
gateways and hosts will see this as information about a new route
that is worse than the one they are already using, and ignore it.
The neighbors whose routes go through G will update their metrics and
send triggered updates to all of their neighbors. Again, only those
neighbors whose routes go through them will pay attention. Thus, the
triggered updates will propagate backwards along all paths leading to
gateway G, updating the metrics to infinity. This propagation will
stop as soon as it reaches a portion of the network whose route to
destination N takes some other path.
If the system could be made to sit still while the cascade of
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -