Network Working Group                                       D. McPherson
Request for Comments: 3345                                           TCB
Category: Informational                                          V. Gill
                                                   AOL Time Warner, Inc.
                                                               D. Walton
                                                               A. Retana
                                                     Cisco Systems, Inc.
                                                             August 2002
Border Gateway Protocol (BGP) Persistent Route Oscillation Condition
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2002). All Rights Reserved.
Abstract
In particular configurations, the BGP scaling mechanisms defined in
"BGP Route Reflection - An Alternative to Full Mesh IBGP" and
"Autonomous System Confederations for BGP" will introduce persistent
BGP route oscillation. This document discusses the two types of
persistent route oscillation that have been identified, describes
when these conditions will occur, and provides some network design
guidelines to avoid introducing such occurrences.
1. Introduction
The Border Gateway Protocol (BGP) is an inter-Autonomous System
routing protocol. The primary function of a BGP speaking system is
to exchange network reachability information with other BGP systems.
In particular configurations, the BGP [1] scaling mechanisms defined
in "BGP Route Reflection - An Alternative to Full Mesh IBGP" [2] and
"Autonomous System Confederations for BGP" [3] will introduce
persistent BGP route oscillation.
The problem is inherent in the way BGP works: locally defined routing
policies may conflict globally, and certain types of conflicts can
cause persistent oscillation of the protocol. Given current
practices, we happen to see the problem manifest itself in the
context of MED + route reflectors or confederations.
McPherson, et al. Informational [Page 1]
RFC 3345 BGP Persistent Route Oscillation Condition August 2002
The current specification of BGP-4 [4] states that the
MULTI_EXIT_DISC is only comparable between routes learned from the
same neighboring AS. This limitation is consistent with the
description of the attribute: "The MULTI_EXIT_DISC attribute may be
used on external (inter-AS) links to discriminate among multiple exit
or entry points to the same neighboring AS." [1,4]
In a full mesh iBGP network, all the internal routers have complete
visibility of the available exit points into a neighboring AS. The
comparison of the MULTI_EXIT_DISC for only some paths is not a
problem.
Because of the scalability implications of a full mesh iBGP network,
two alternatives have been standardized: route reflectors [2] and AS
confederations [3]. Both alternatives describe methods by which
route distribution may be achieved without a full iBGP mesh in an AS.
The route reflector alternative defines the ability to re-advertise
(reflect) iBGP-learned routes to other iBGP peers once the best path
is selected [2]. AS Confederations specify the operation of a
collection of autonomous systems under a common administration as a
single entity (i.e. from the outside, the internal topology and the
existence of separate autonomous systems are not visible). In both
cases, reducing the iBGP full mesh means that not all the BGP
speakers in the AS have complete visibility of the available exit
points into a neighboring AS. In fact, the visibility
may be partial and inconsistent depending on the location (and
function) of the router in the AS.
In certain topologies involving either route reflectors or
confederations (detailed description later in this document), the
partial visibility of the available exit points into a neighboring AS
may result in an inconsistent best path selection decision as the
routers don't have all the relevant information. If the
inconsistencies span more than one peering router, they may result in
a persistent route oscillation. The best path selection rules
applied in this document are consistent with the current
specification [4].
The persistent route oscillation behavior is deterministic and can be
avoided by employing some rudimentary BGP network design principles
until protocol enhancements resolve the problem.
In the following sections a taxonomy of the types of oscillations is
presented and a description of the set of conditions that will
trigger route oscillations is given. We continue by providing
several network design alternatives that remove the potential for
such oscillation.
It is the intent of the authors that this document serve to increase
operator awareness of the problem, as well as to trigger discussion
and subsequent proposals for potential protocol enhancements that
remove the possibility of its occurrence.
The oscillations are classified into Type I and Type II depending
upon the criteria documented below.
2. Discussion of Type I Churn
In the following two subsections we provide configurations under
which Type I Churn will occur. We begin with a discussion of the
problem when using Route Reflection, and then discuss the problem as
it relates to AS Confederations.
In general, Type I Churn occurs only when BOTH of the following
conditions are met:
1) a single-level Route Reflection or AS Confederations design is
used in the network AND
2) the network accepts the BGP MULTI_EXIT_DISC (MED) attribute
from two or more ASs for a single prefix and the MED values are
unique.
It is also possible for the non-deterministic ordering of paths to
cause the route oscillation problem. [1] does not specify that paths
should be ordered based on MEDs but it has been proven that non-
deterministic ordering can lead to loops and inconsistent routing
decisions. Most vendors have either implemented deterministic
ordering as default behavior, or provide a knob that permits the
operator to configure the router to order paths in a deterministic
manner based on MEDs.
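The effect of path ordering can be illustrated with a small sketch.
The Python below is not taken from any implementation; the Path
record, the pairwise rule, and the three sample paths are assumptions
made for illustration, and only MED and IGP cost are modeled. It
contrasts a selection that compares paths pairwise in arrival order
with one that first picks the lowest-MED path per neighboring AS.

```python
from collections import namedtuple
from itertools import permutations

# Hypothetical path record; the field names are illustrative only.
Path = namedtuple("Path", "neighbor_as med igp_cost")


def better(a, b):
    """Pairwise rule: MED decides only between paths from the same
    neighboring AS; otherwise the lower IGP cost to NEXT_HOP wins."""
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def select_sequential(paths):
    """Order-dependent: fold the pairwise rule over the list as given."""
    best = paths[0]
    for candidate in paths[1:]:
        best = better(best, candidate)
    return best


def select_deterministic(paths):
    """Group by neighboring AS, keep the lowest-MED path per group,
    then compare the per-AS winners on IGP cost. Ties on IGP cost
    are not broken further in this sketch."""
    winners = {}
    for p in paths:
        cur = winners.get(p.neighbor_as)
        if cur is None or (p.med, p.igp_cost) < (cur.med, cur.igp_cost):
            winners[p.neighbor_as] = p
    return min(winners.values(), key=lambda p: p.igp_cost)


# Illustrative paths for one prefix: two via AS6, one via AS10.
x = Path(neighbor_as=6, med=0, igp_cost=13)
y = Path(neighbor_as=6, med=1, igp_cost=4)
z = Path(neighbor_as=10, med=10, igp_cost=5)

for order in permutations([x, y, z]):
    print(select_sequential(list(order)), select_deterministic(list(order)))
```

With these sample values the sequential fold can return any of the
three paths depending on arrival order, while the grouped selection
returns the AS10 path for every ordering.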
2.1. Route Reflection and Type I Churn
We now discuss Type I oscillation as it relates to Route Reflection.
To begin, consider the topology depicted in Figure 1:
  ---------------------------------------------------------------
 /    --------------------               --------------------    \
|    /                    \             /                    \    |
|   |      Cluster 1       |           |      Cluster 2       |   |
|   |                      |           |                      |   |
|   |                      |    *1     |                      |   |
|   |       Ra(RR) . . . . . . . . . . . . . . Rd(RR)         |   |
|   |        .   .         |           |          .           |   |
|   |      .*5     .*4     |           |          .*12        |   |
|   |     .          .     |           |          .           |   |
|   |  Rb(C)        Rc(C)  |           |        Re(C)         |   |
|   |    .            .    |           |          .           |   |
|    \   .            .   /             \         .          /    |
|     ---.------------.---               ---------.----------     |
 \       .(10)        .(1)      AS1               .(0)           /
  -------.------------.---------------------------.--------------
         .            .                           .
       ------          .      ------------       .
      /      \           .   /            \   .
     |  AS10  |             |     AS6      |
      \      /               \            /
       ------                 ------------
          .                        .
             .                     .
                .           --------------
                   .       /              \
                          |     AS100      |- 10.0.0.0/8
                           \              /
                            --------------
Figure 1: Example Route Reflection Topology
In Figure 1 AS1 contains two Route Reflector Clusters, Clusters 1 and
2. Each Cluster contains one Route Reflector (RR) (i.e., Ra and Rd,
respectively). An associated 'RR' in parentheses represents each RR.
Cluster 1 contains two RR Clients (Rb and Rc), and Cluster 2 contains
one RR Client (Re). An associated 'C' in parentheses indicates RR
Client status. The dotted lines are used to represent BGP peering
sessions.
The number contained in parentheses on the AS1 EBGP peering sessions
represents the MED value advertised by the peer to be associated with
the 10.0.0.0/8 network reachability advertisement.
The number following each '*' on the IBGP peering sessions represents
the additive IGP metrics that are to be associated with the BGP
NEXT_HOP attribute for the concerned route. For example, the Ra IGP
metric value associated with a NEXT_HOP learned via Rb would be 5;
while the metric value associated with the NEXT_HOP learned via Re
would be 13.
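The additive metrics can be checked with a short sketch. The Python
below is illustrative only: the IGP_LINK table and the helper names
are assumptions of this example, links are treated as symmetric, and
the cost to a NEXT_HOP is approximated as the cost to the client
router that learned the route.

```python
# IGP metrics on the IBGP sessions of Figure 1 (the '*' values).
IGP_LINK = {
    ("Ra", "Rb"): 5,
    ("Ra", "Rc"): 4,
    ("Ra", "Rd"): 1,
    ("Rd", "Re"): 12,
}


def link_metric(a, b):
    """Look up one link, treating it as symmetric (an assumption)."""
    if (a, b) in IGP_LINK:
        return IGP_LINK[(a, b)]
    return IGP_LINK[(b, a)]


def igp_cost(hops):
    """Sum the link metrics along an ordered list of router names."""
    return sum(link_metric(a, b) for a, b in zip(hops, hops[1:]))


print(igp_cost(["Ra", "Rb"]))        # Ra to the NEXT_HOP learned via Rb
print(igp_cost(["Ra", "Rd", "Re"]))  # Ra to the NEXT_HOP learned via Re
```

The same helper gives the costs seen from Rd, for example
igp_cost(["Rd", "Ra", "Rc"]) for the route learned via Rc.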
Table 1 depicts the 10.0.0.0/8 route attributes as seen by routers
Rb, Rc and Re, respectively. Note that the IGP metrics in Figure 1
are only of concern when advertising the route to an IBGP peer.
   Router  MED  AS_PATH
   --------------------
   Rb      10   10 100
   Rc       1    6 100
   Re       0    6 100
Table 1: Route Attribute Table
For the following steps 1 through 5, the best path will be marked
with a '*'.
1) Ra has the following installed in its BGP table, with the path
learned via AS10 marked best:
                  NEXT_HOP
     AS_PATH  MED IGP Cost
   -----------------------
       6 100    1        4
   *  10 100   10        5
The '10 100' route should not be marked as best, though this is
not the cause of the persistent route oscillation. Ra realizes
it has the wrong route marked as best since the '6 100' path
has a lower IGP metric. As such, Ra makes this change and
advertises an UPDATE message to its neighbors to let them know
that it now considers the '6 100, 1, 4' route as best.
2) Rd receives the UPDATE from Ra, which leaves Rd with the
following installed in its BGP table:
                  NEXT_HOP
     AS_PATH  MED IGP Cost
   -----------------------
   *   6 100    0       12
       6 100    1        5
Rd then marks the '6 100, 0, 12' route as best because it has a
lower MED. Rd sends an UPDATE message to its neighbors to let
them know that this is the best route.
3) Ra receives the UPDATE message from Rd and now has the
following in its BGP table:
                  NEXT_HOP
     AS_PATH  MED IGP Cost
   -----------------------
       6 100    0       13
       6 100    1        4
   *  10 100   10        5
The first route (6 100, 0, 13) beats the second route (6 100,
1, 4) because of a lower MED. Then the third route (10 100,
10, 5) beats the first route because of lower IGP metric to
NEXT_HOP. Ra sends an UPDATE message to its peers informing
them of the new best route.
4) Rd receives the UPDATE message from Ra, which leaves Rd with
the following BGP table:
                  NEXT_HOP
     AS_PATH  MED IGP Cost
   -----------------------
       6 100    0       12
   *  10 100   10        6
Rd selects the '10 100, 10, 6' path as best because of the IGP
metric. Rd sends an UPDATE/withdraw to its peers letting them
know this is the best route.
5) Ra receives the UPDATE message from Rd, which leaves Ra with
the following BGP table:
                  NEXT_HOP
     AS_PATH  MED IGP Cost
   -----------------------
       6 100    1        4
   *  10 100   10        5
Ra received an UPDATE/withdraw for '6 100, 0, 13', which
changes what is considered the best route for Ra. This is why
Ra has the '10 100, 10, 5' route selected as best in Step 1,
even though '6 100, 1, 4' is actually better.
At this point, we've made a full loop and are back at Step 1. The
router realizes it is using the incorrect best path, and repeats
the cycle. This is an example of Type I Churn when using Route
Reflection.
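The cycle in steps 1 through 5 can be reproduced with a small
simulation. The Python below is a simplified sketch rather than a BGP
implementation: the Path record, the function names, and the
round-based message exchange are assumptions of this example. Only
MED (compared within the same neighboring AS) and IGP cost are
modeled, and a reflector is assumed to advertise its best path to the
other reflector only when that path was learned from its own client,
withdrawing its earlier advertisement otherwise.

```python
from collections import namedtuple

# Hypothetical path record; 'origin' names the reflector whose
# client learned the route from the EBGP peer.
Path = namedtuple("Path", "as_path med igp_cost origin")

RA_RD_METRIC = 1  # IGP metric on the Ra-Rd session in Figure 1


def select_best(paths):
    """Lowest MED per neighboring AS, then lowest IGP cost overall."""
    winners = {}
    for p in paths:
        key = p.as_path[0]
        cur = winners.get(key)
        if cur is None or (p.med, p.igp_cost) < (cur.med, cur.igp_cost):
            winners[key] = p
    return min(winners.values(), key=lambda p: p.igp_cost)


def advertise(best, local_rr):
    """Path sent to the other reflector, or None for a withdraw."""
    if best.origin != local_rr:
        return None  # best path came from the other reflector
    return best._replace(igp_cost=best.igp_cost + RA_RD_METRIC)


def simulate(rounds):
    """Alternate Ra and Rd decisions; return each round's best paths."""
    ra_clients = [Path((6, 100), 1, 4, "Ra"), Path((10, 100), 10, 5, "Ra")]
    rd_clients = [Path((6, 100), 0, 12, "Rd")]
    from_rd = None  # what Rd currently advertises to Ra
    history = []
    for _ in range(rounds):
        ra_table = ra_clients + ([from_rd] if from_rd is not None else [])
        ra_best = select_best(ra_table)
        from_ra = advertise(ra_best, "Ra")
        rd_table = rd_clients + ([from_ra] if from_ra is not None else [])
        rd_best = select_best(rd_table)
        from_rd = advertise(rd_best, "Rd")
        history.append((ra_best.as_path, rd_best.as_path))
    return history


for ra_best, rd_best in simulate(4):
    print("Ra best:", ra_best, " Rd best:", rd_best)
```

Under these assumptions the best path at both reflectors alternates
between '6 100' and '10 100' from one round to the next and never
settles, which matches the loop described above.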
2.2. AS Confederations and Type I Churn
Now we provide an example of Type I Churn occurring with AS
Confederations. To begin, consider the topology depicted in Figure
2:
  ---------------------------------------------------------------
 /    --------------------               --------------------    \
|    /                    \             /                    \    |