RFC 2201 CBT Multicast Routing Architecture September 1997
In the "back of the envelope" table below we compare the amount of
state required by CBT and DVMRP for different group sizes with
different numbers of active sources:
|--------------|-----------------|-----------------|-----------------|
| Number of    |                 |                 |                 |
| groups       |       10        |       100       |      1000       |
|==============|=================|=================|=================|
| Group size   |                 |                 |                 |
| (# members)  |       20        |       40        |       60        |
|--------------|-----------------|-----------------|-----------------|
| No. of srcs  |     |     |     |     |     |     |     |     |     |
| per group    | 10% | 50% |100% | 10% | 50% |100% | 10% | 50% |100% |
|--------------|-----------------|-----------------|-----------------|
| No. of DVMRP |     |     |     |     |     |     |     |     |     |
| router       |     |     |     |     |     |     |     |     |     |
| entries      | 20  | 100 | 200 | 400 | 2K  | 4K  | 6K  | 30K | 60K |
|--------------|-----------------|-----------------|-----------------|
| No. of CBT   |                 |                 |                 |
| router       |                 |                 |                 |
| entries      |       10        |       100       |      1000       |
|--------------|-----------------|-----------------|-----------------|
Figure 1: Comparison of DVMRP and CBT Router State
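The figures in the table can be reproduced with simple arithmetic. The
sketch below assumes, as the table appears to, that a DVMRP router holds
one entry per (source, group) pair and a CBT router holds one entry per
group. The function names are illustrative and do not come from either
protocol specification.

```python
def dvmrp_entries(groups: int, members: int, src_percent: int) -> int:
    """Source-tree state: one entry per (source, group) pair."""
    sources_per_group = members * src_percent // 100
    return groups * sources_per_group


def cbt_entries(groups: int) -> int:
    """Shared-tree state: one entry per group, whatever the sender count."""
    return groups


if __name__ == "__main__":
    # (number of groups, group size) columns from the table above
    for groups, members in [(10, 20), (100, 40), (1000, 60)]:
        dvmrp = [dvmrp_entries(groups, members, p) for p in (10, 50, 100)]
        print(groups, members, dvmrp, cbt_entries(groups))
```

The DVMRP count grows with both the number of groups and the number of
active senders, whereas the CBT count depends on the number of groups only.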
Shared trees also yield significant bandwidth and state savings
compared with source trees. Firstly, the tree only spans a group's
receivers (including links/routers leading to receivers) -- there is
no cost to routers/links in other parts of the network. Secondly,
routers between a non-member sender and the delivery tree incur no
multicast-related cost, and indeed, these routers need not even be
multicast-capable -- packets from non-member senders are
encapsulated and unicast to a core on the tree.
Ballardie Experimental [Page 6]
The figure below illustrates a core based tree.
b      b     b-----b
 \     |     |
  \    |     |
   b---b     b------b
  /     \  /                  KEY....
 /       \/
b         X---b-----b         X = Core
         / \                  b = on-tree router
        /   \
       /     \
      b       b------b
     / \      |
    /   \     |
   b     b    b
Figure 2: CBT Tree
4. CBT - The New Architecture
4.1. Design Requirements
The CBT shared tree design was geared towards several design
objectives:
o scalability - the CBT designers decided not to sacrifice CBT's
O(G) scaling characteristic to optimize delay using SPTs, as does
PIM. This was an important design decision, and one that, we
think, was taken with foresight; once multicasting becomes
ubiquitous, router state maintenance will be a predominant
scaling factor.
It is possible in some circumstances to improve/optimize the
delay of shared trees by other means. For example, a broadcast-
type lecture with a single sender (or limited set of
infrequently changing senders) could have its core placed in the
locality of the sender, allowing the CBT to emulate a shortest-
path tree (SPT) whilst still maintaining its O(G) scaling
characteristic. More generally, because CBT does not incur
source-specific state, it is particularly suited to many-sender
applications.
o robustness - source-based tree algorithms are clearly robust; a
sender simply sends its data, and intervening routers "conspire"
to get the data where it needs to go, creating state along the way.
This is the so-called "data driven" approach -- there is no
set-up protocol involved.
It is not as easy to achieve the same degree of robustness in
shared tree algorithms; a shared tree's core router maintains
connectivity between all group members, and is thus a single
point of failure. Protocol mechanisms must be present that
ensure a core failure is detected quickly, and the tree
reconnected quickly using a replacement core router.
o simplicity - the CBT protocol is relatively simple compared to
most other multicast routing protocols. This simplicity can lead
to enhanced performance compared to other protocols.
o interoperability - from a multicast perspective, the Internet is
a collection of heterogeneous multicast regions. The protocol
interconnecting these multicast regions is currently DVMRP [6];
any regions not running DVMRP connect to the DVMRP "backbone" as
stub regions. CBT has well-defined interoperability mechanisms
with DVMRP [15].
4.2. CBT Components & Functions
The CBT protocol is designed to build and maintain a shared multicast
distribution tree that spans only those networks and links leading to
interested receivers.
To achieve this, a host first expresses its interest in joining a
group by multicasting an IGMP host membership report [5] across its
attached link. On receiving this report, a local CBT-aware router
invokes the tree joining process (unless it has already) by
generating a JOIN_REQUEST message, which is sent to the next hop on
the path towards the group's core router (how the local router
discovers which core to join is discussed in section 6). This join
message must be explicitly acknowledged (JOIN_ACK) either by the core
router itself, or by another router that is on the unicast path
between the sending router and the core, which itself has already
successfully joined the tree.
The join message sets up transient join state in the routers it
traverses, and this state consists of <group, incoming interface,
outgoing interface>. "Incoming interface" and "outgoing interface"
may be "previous hop" and "next hop", respectively, if the
corresponding links do not support multicast transmission. "Previous
hop" is taken from the incoming control packet's IP source address,
and "next hop" is gleaned from the routing table - the next hop to
the specified core address. This transient state eventually times out
unless it is "confirmed" with a join acknowledgement (JOIN_ACK) from
upstream. The JOIN_ACK traverses the reverse path of the
corresponding join message, which is possible due to the presence of
the transient join state. Once the acknowledgement reaches the
router that originated the join message, the new receiver can receive
traffic sent to the group.
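The join and acknowledgement exchange described above can be sketched as
follows. This is a simplified, single-threaded model under stated
assumptions: the Router class, its field names, and the join() helper are
hypothetical, and timers, retransmission, and concurrent joins for the
same group are ignored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Router:
    name: str
    # Unicast next hop toward the core; None marks the core itself.
    next_hop: Optional["Router"] = None
    transient: dict = field(default_factory=dict)  # group -> previous hop
    parent: dict = field(default_factory=dict)     # group -> parent name
    children: dict = field(default_factory=dict)   # group -> set of children

    def on_tree(self, group: str) -> bool:
        # The core is always on-tree; others only after a JOIN_ACK.
        return self.next_hop is None or group in self.parent


def join(origin: Router, group: str) -> list:
    """Send a JOIN_REQUEST upstream, then return a JOIN_ACK downstream.

    Returns the names of the routers confirmed by the acknowledgement.
    """
    # Phase 1: the join leaves transient state at every router it reaches
    # and stops at the core or at the first router already on the tree.
    path = [origin]
    while not path[-1].on_tree(group):
        sender = path[-1]
        upstream = sender.next_hop
        upstream.transient[group] = sender.name
        path.append(upstream)

    # Phase 2: the acknowledgement retraces the path using the transient
    # state, turning it into parent/child forwarding state.
    acked = []
    for upstream, downstream in zip(reversed(path), reversed(path[:-1])):
        child = upstream.transient.pop(group)
        upstream.children.setdefault(group, set()).add(child)
        downstream.parent[group] = upstream.name
        acked.append(downstream.name)
    return acked
```

In this model a second join that reaches an on-tree router is acknowledged
there and never travels as far as the core.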
Loops cannot be created in a CBT tree because a) there is only one
active core per group, and b) tree building/maintenance scenarios
which may lead to the creation of tree loops are avoided. For
example, if a router's upstream neighbour becomes unreachable, the
router immediately "flushes" all of its downstream branches, allowing
them to individually rejoin if necessary. Transient unicast loops do
not pose a threat because a new join message that loops back on
itself will never get acknowledged, and thus eventually times out.
The state created in routers by the sending or receiving of a
JOIN_ACK is bi-directional - data can flow either way along a tree
"branch", and the state is group specific - it consists of the group
address and a list of local interfaces over which join messages for
the group have previously been acknowledged. There is no concept of
"incoming" or "outgoing" interfaces, though it is necessary to be
able to distinguish the upstream interface from any downstream
interfaces. In CBT, these interfaces are known as the "parent" and
"child" interfaces, respectively.
With regard to the information contained in the multicast forwarding
cache, on link types not supporting native multicast transmission an
on-tree router must store the address of a parent and any children.
On links supporting multicast, however, parent and any child
information is represented with local interface addresses (or similar
identifying information, such as an interface "index") over which the
parent or child is reachable.
When a multicast data packet arrives at a router, the router uses the
group address as an index into the multicast forwarding cache. A copy
of the incoming multicast data packet is forwarded over each
interface (or to each address) listed in the entry except the
incoming interface.
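The forwarding rule above amounts to a lookup followed by a filter. A
minimal sketch, assuming the forwarding cache is a dictionary keyed by
group address; the function name and the choice to return an empty list
for an unknown group are illustrative simplifications.

```python
def forward(cache: dict, group: str, incoming: str) -> list:
    """Return the interfaces (or addresses) that receive a copy of a packet.

    `cache` maps a group address to the tree interfaces for that group,
    parent and children alike, since the state is bi-directional.
    """
    interfaces = cache.get(group, [])  # no entry: nothing to forward to
    return [iface for iface in interfaces if iface != incoming]
```

Because the entry makes no distinction between "incoming" and "outgoing"
interfaces, the same lookup serves traffic flowing toward or away from
the core.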
Each router that comprises a CBT multicast tree, except the core
router, is responsible for maintaining its upstream link, provided it
has interested downstream receivers, i.e. the child interface list is
not NULL. A child interface is one over which a member host is
directly attached, or one over which a downstream on-tree router is
attached. This "tree maintenance" is achieved by each downstream
router periodically sending a "keepalive" message (ECHO_REQUEST) to
its upstream neighbour, i.e. its parent router on the tree. One
keepalive message is sent to represent entries with the same parent,
thereby improving scalability on links which are shared by many
groups. On multicast-capable links, a keepalive is multicast to the
"all-cbt-routers" group (IANA assigned as 224.0.0.15); this has a
suppressing effect on any other router for which the link is its
parent link. If a parent link does not support multicast
transmission, keepalives are unicast.
The receipt of a keepalive message over a valid child interface
immediately prompts a response (ECHO_REPLY), which is either unicast
or multicast, as appropriate.
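The keepalive aggregation can be illustrated by grouping forwarding
entries by parent: one ECHO_REQUEST is then sent per distinct parent
rather than per group. This is a sketch only; the dictionary layout and
function name are assumptions, and the grouping is local bookkeeping, not
something carried in the message.

```python
from collections import defaultdict


def keepalives_to_send(parent_of: dict) -> dict:
    """Map each parent to the groups that one keepalive stands for.

    `parent_of` maps a group address to the parent router for that group.
    The number of keys in the result is the number of keepalives sent.
    """
    by_parent = defaultdict(list)
    for group, parent in sorted(parent_of.items()):
        by_parent[parent].append(group)
    return dict(by_parent)
```

With many groups sharing a parent link, the number of keepalives stays
tied to the number of parents rather than the number of groups.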
The ECHO_REQUEST does not contain any group information; the
ECHO_REPLY does, but only periodically. To maintain consistent
information between parent and child, the parent periodically
reports, in an ECHO_REPLY, all groups for which it has state, over
each of its child interfaces for those groups. This group-carrying
echo reply is not prompted explicitly by the receipt of an echo
request message. Instead, each echo reply prompted by a child's echo
request tells the child when to expect the next echo reply
containing group information. The frequency of parent group
reporting is at the granularity of minutes.
It cannot be assumed that all of the routers on a multi-access link have a
uniform view of unicast routing; this is particularly the case when a
multi-access link spans two or more unicast routing domains. This
could lead to multiple upstream tree branches being formed (an error
condition) unless steps are taken to ensure all routers on the link
agree which is the upstream router for a particular group. CBT
routers attached to a multi-access link participate in an explicit
election mechanism that elects a single router, the designated router
(DR), as the link's upstream router for all groups. Since the DR
might not be the link's best next-hop for a particular core router,
this may result in join messages being re-directed back across a
multi-access link. If this happens, the re-directed join message is
unicast across the link by the DR to the best next-hop, thereby
preventing a looping scenario. This re-direction only ever applies
to join messages. Whilst this is suboptimal for join messages, which
are generated infrequently, multicast data never traverses a link
more than once (either natively, or encapsulated).
In all but the exception case described above, all CBT control
messages sent over multicast-supporting links are multicast to the
"all-cbt-routers" group, with IP TTL 1. When a CBT control message is
sent over a link that does not support multicast, it is explicitly
addressed to the appropriate next hop.
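The addressing rule for control messages can be summarised in a few
lines. This is a sketch under assumptions: the function and parameter
names are hypothetical, and the only values taken from the text are the
224.0.0.15 group address and the TTL of 1 for multicast control messages.

```python
ALL_CBT_ROUTERS = "224.0.0.15"  # IANA-assigned "all-cbt-routers" group
MULTICAST_TTL = 1               # multicast control messages stay on-link


def control_destination(link_is_multicast: bool, next_hop: str,
                        redirected_join: bool = False) -> str:
    """Pick the IP destination for a CBT control message on one link."""
    if link_is_multicast and not redirected_join:
        return ALL_CBT_ROUTERS
    # Non-multicast link, or a join re-directed by the DR across a
    # multi-access link: address the next hop explicitly.
    return next_hop
```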
4.2.1. CBT Control Message Retransmission Strategy
Certain CBT control messages elicit a response of some sort. Lack of
response may be due to an upstream router crashing, or the loss of
the original message, or its response. To detect these events, CBT