an alternative path is now the winner.  As there can be many thousands
of routes in a RibIn, this process can take some time.
\item When a new peering comes up, all the winning routes must be sent
to that peer.  This can also take some time.
\item When the IGP information related to a BGP nexthop changes, all
the routes that specify this nexthop must be re-evaluated to see if
the change affects the choice of route.  In BGP it is fairly common
for a very large number of routes to share the same BGP nexthop, so
this re-evaluation can take some time.
\end{itemize}

The XORP BGP process cannot simply process such events to completion -
in particular it must keep processing XRL requests from other
processes or the IPC mechanism may declare a failure.  In any event,
it is important that a single slow peer cannot cause route forwarding
between other peers to stall.  Thus we process the events above as
``background tasks''.

In a multithreaded architecture, such tasks might be separate threads,
but the locking issues soon become very complex.  In our
single-threaded architecture, there are no complex locking issues, but
the background nature of such tasks needs to be explicitly coded.  We
do this by dividing the background task into small enough segments.
At the completion of such a segment, we set a zero-second timer to
schedule execution of the next segment, and drop back to the main
event loop.  Execution will then be restarted after pending network
events and expired timers have been processed by the main event loop.
Care must of course be taken to ensure that when execution returns,
the processing of events or other background tasks has not invalidated
the state that the background task needs in order to restart.
However, as the processing of each background task segment is
naturally atomic in a single-threaded architecture, there are fewer
possibilities for bad interactions.  Even so, the state stored by
these background tasks to enable their correct restart involves some
rather complex algorithms.

\subsection{DeletionTable Class}

When a peering goes down, the routing table stored in the RibIn is
moved to a DeletionTable that is plumbed in directly after the RibIn,
as shown in Figure \ref{del_table}.

\begin{figure}[htb]
\centerline{\includegraphics[width=0.7\textwidth]{figs/del_table}}
\vspace{.05in}
\caption{\label{del_table}Dynamic insertion of a DeletionTable on
peering failure}
\end{figure}

The task of the DeletionTable is to delete all the routes that were
previously stored in the RibIn, but to do so as a background task,
allowing the BGP process to continue to handle new events.

Deletion is scheduled as a series of phases.  In a single phase, all
the routes that share a single set of Path Attributes are deleted.  In
this way, if there are alternative routes in a different RibIn that
also share a path attribute list (a fairly common occurrence), then
the resulting changes of preferred route may be batched in such a way
that a single update message can convey the change to each neighbor.
At the end of a phase, the DeletionTable schedules execution of the
next deletion phase using a zero-second timer.  This allows all
pending timer or network events to be handled before deletion resumes.
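To make the segmented background-task pattern concrete, the sketch
below reduces it to a minimal, self-contained form: a deque of
callbacks stands in for the XORP event loop, and the ``zero-second
timer'' becomes re-queueing the continuation behind whatever events
are already pending.  The names and the fixed segment size are
illustrative only; the real code uses the XORP timer and callback
infrastructure, and a real phase deletes one set of Path Attributes
rather than a fixed count of routes.

\begin{verbatim}
#include <algorithm>
#include <deque>
#include <functional>
#include <iostream>
#include <vector>

// Stand-in for the main event loop: a FIFO queue of callbacks.
std::deque<std::function<void()> > event_queue;

// Delete routes in bounded segments, returning to the event loop
// between segments so other events can be processed.
void delete_segment(std::vector<int>* routes) {
    const std::size_t kSegment = 100;  // one "phase" of work
    std::size_t n = std::min(kSegment, routes->size());
    routes->erase(routes->begin(), routes->begin() + n);
    if (!routes->empty()) {
        // The "zero-second timer": reschedule ourselves behind any
        // events that are already pending.
        event_queue.push_back([routes] { delete_segment(routes); });
    }
}

int main() {
    std::vector<int> rib_in(1000);  // pretend RibIn routes
    event_queue.push_back([&] { delete_segment(&rib_in); });
    while (!event_queue.empty()) {  // the main event loop
        std::function<void()> ev = event_queue.front();
        event_queue.pop_front();
        ev();  // each segment runs atomically to completion
    }
    std::cout << "routes remaining: " << rib_in.size() << std::endl;
}
\end{verbatim}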
The DeletionTable must respond to {\tt lookup\_route} requests from
downstream just as a RibIn table would - even though the DeletionTable
knows that the routes it holds will be deleted, it must respond as if
they had not yet been deleted until it has had a chance to send the
relevant {\tt delete\_route} message downstream.  In this way, the
DeletionTable provides a consistent view to the downstream tables - if
they earlier performed a {\tt lookup\_route} and got a specific
answer, then they will still get the same answer unless they have
received a {\tt delete\_route} or {\tt replace\_route} informing them
of a change.

When the last route is deleted from the DeletionTable, the table
unplumbs itself from the BGP plumbing and deletes itself.

A small complication is added by the possibility that the peering
might come back up before the DeletionTable has finished deleting all
the old routes.  Thus if the DeletionTable receives an {\tt
add\_route} from upstream for a route that is present in the
DeletionTable, then this route is passed on downstream as a {\tt
replace\_route}, and the route is then immediately removed from the
DeletionTable.  {\tt add\_route}, {\tt replace\_route} and {\tt
delete\_route} calls for routes not present in the DeletionTable are
simply passed from parent to child without modification.

Should the peering come up and go down again before all the routes in
the first DeletionTable have been deleted, a second DeletionTable
would be inserted before the first one.  By virtue of the normal
functioning of the DeletionTable, two such cascaded DeletionTables can
never hold the same route, so this does not add any additional
complication.
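The sketch below restates this {\tt add\_route} handling in code.  It
is a minimal illustration under assumed types - the Subnet, Route and
RouteSink names are ours, not XORP's, and the real tables are
templated on address family and carry full path-attribute state.

\begin{verbatim}
#include <map>
#include <string>
#include <utility>

typedef std::string Subnet;          // illustrative stand-ins
struct Route { int metric; };

struct RouteSink {                   // the next table downstream
    virtual void add_route(const Subnet&, const Route&) = 0;
    virtual void replace_route(const Subnet&, const Route& old_rt,
                               const Route& new_rt) = 0;
    virtual ~RouteSink() {}
};

class DeletionTable {
public:
    DeletionTable(std::map<Subnet, Route> doomed, RouteSink* down)
        : _doomed(std::move(doomed)), _downstream(down) {}

    // Called when a revived peering re-announces a prefix.
    void add_route(const Subnet& net, const Route& new_route) {
        std::map<Subnet, Route>::iterator it = _doomed.find(net);
        if (it != _doomed.end()) {
            // Downstream still believes the old route exists, so
            // the announcement becomes a replace, and our doomed
            // copy is dropped immediately.
            Route old_route = it->second;
            _doomed.erase(it);
            _downstream->replace_route(net, old_route, new_route);
        } else {
            // Not one of ours: pass through unchanged.
            _downstream->add_route(net, new_route);
        }
    }

private:
    std::map<Subnet, Route> _doomed;  // routes awaiting deletion
    RouteSink* _downstream;
};
\end{verbatim}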
\subsection{DumpTable Class}

When a peering comes up, all the currently winning routes from the
other peers must be sent to the new peer.  This process is managed by
an instance of the DumpTable class, which is inserted between the
fanout table and the first table on the output branch to the peer that
came up (see Figure \ref{dump_table}).

\begin{figure}[htb]
\centerline{\includegraphics[width=0.6\textwidth]{figs/dump_table}}
\vspace{.05in}
\caption{\label{dump_table}Dynamic insertion of a DumpTable on a
peering coming up.}
\end{figure}

A DumpTable is perhaps the most complex part of the BGP machinery.
While the dump is taking place, it must cope with:
\begin{itemize}
\item New routing information being propagated.
\item Other peers going down and coming back up, possibly repeatedly.
\end{itemize}
It must do this without propagating inconsistent information
downstream, such as a {\tt replace\_route} or a {\tt delete\_route}
without sending an {\tt add\_route} first.  It must ensure that all
the routes that won the decision process are dumped to the new peer.
And it must cope when the routes just before and just after the most
recent route it dumped are deleted, without losing track of where it
is in the dump process.  The process is complex enough to merit
description here in detail.

Each DumpTable contains a DumpIterator instance which holds all the
state related to that particular route dump.  The DumpIterator
contains the following state:
\begin{itemize}
\item A list of the remaining peers to dump, initialized at the start
of the dump.  Peers are removed from this list when the dump of the
routes received from that peer is complete, or when the peer goes down
during the dump.  Peers are never added to this list during the dump.
\item A list of the peers that went down before we had finished
dumping them.  Along with each peer is stored enough state to know how
far we had got through the route dump from that peer before it went
down.
\item The current peer whose routes are being dumped.
\item A trie iterator pointing to the next RibIn trie entry to be
dumped on the current peer (we call this the Route Iterator).
\item The last subnet dumped from the current peer.
\item A flag that indicates whether the Route Iterator is currently
valid.
\item The last GenID seen.  The GenID of the RibIn is incremented each
time the peering comes up.
\end{itemize}

At startup, the DumpTable initializes the list of remaining peers in
the DumpIterator to be the set of peers that are currently up.  Then
it iterates through this list, dumping all the routes from each RibIn
in turn.

The DumpTable calls {\tt dump\_next\_route} on its parent table,
passing the DumpIterator as a parameter.  The {\tt dump\_next\_route}
request is relayed upstream to the DecisionTable.  The DecisionTable
relays the request to the input branch of the first peer listed in the
list of remaining peers, and from there it is relayed back to the
relevant RibIn.

To allow routes in the RibIn to be deleted during the route dump
without the Route Iterator becoming invalid, we use a special trie and
trie iterator in the RibIn, where a route in the trie will not
actually be deleted until no trie iterator is pointing at it.  This is
implemented using reference counts in the Trie nodes themselves, as
sketched below.

{\tt dump\_next\_route} in the RibIn checks the Route Iterator to see
if it points to the end of the Trie.  If the previous call to {\tt
dump\_next\_route} had left the Route Iterator pointing to a node that
has subsequently been deleted, this check transparently causes the
Route Iterator to move on to the next non-deleted node, or to the end
of the Trie if there are no subsequent non-deleted nodes.  This
updating of the Route Iterator is transparent to the user of the
iterator.

The route pointed to by the Route Iterator is then propagated
downstream from the RibIn to the DumpTable as a {\tt route\_dump}
call.  The DumpTable turns this into an {\tt add\_route} call which it
propagates downstream to the new peer.

At the end of {\tt dump\_next\_route}, the RibIn increments the Route
Iterator ready for the next call.  The DumpTable then schedules the
next call to {\tt dump\_next\_route} using a zero-second timer to
allow other pending events to be processed.

On returning from the timer, the DumpTable checks to see if any route
changes have been queued upstream in the FanoutTable due to output
flow control.  If so, it processes all these route changes before
dumping the next route.  This is necessary because otherwise the
DumpTable cannot tell which changes must be propagated downstream
because their location in the dump has already been passed, and which
are unnecessary because we will get around to dumping them eventually.
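The following sketch shows the essence of the iterator-safe deletion
described above, reduced to a singly-linked list for brevity (the real
structure is a trie keyed by subnet).  Here std::shared\_ptr supplies
the reference count that the real Trie nodes carry explicitly: a
deleted node is unlinked and marked, but any iterator parked on it
keeps it alive and can still follow its next pointer.

\begin{verbatim}
#include <memory>

struct Node {
    int route;                   // payload (a route, in the RibIn)
    bool deleted;
    std::shared_ptr<Node> next;  // the reference count lives here
    explicit Node(int r) : route(r), deleted(false) {}
};

class Iterator {
public:
    explicit Iterator(std::shared_ptr<Node> n) : _cur(n) {}

    // Transparently skip any nodes deleted since we last advanced;
    // releasing _cur lets fully-unreferenced nodes be reclaimed.
    Node* get() {
        while (_cur && _cur->deleted)
            _cur = _cur->next;
        return _cur.get();
    }
    void advance() {
        if (get() != nullptr)
            _cur = _cur->next;
    }
private:
    std::shared_ptr<Node> _cur;  // keeps the current node alive
};

// Delete the node after 'prev': unlink it and mark it.  Its own
// next pointer is left intact so parked iterators can still walk
// past it; the memory is freed once the last iterator moves on.
void erase_after(Node& prev) {
    if (prev.next) {
        prev.next->deleted = true;
        prev.next = prev.next->next;
    }
}
\end{verbatim}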
In general, a route change needs to be propagated downstream if:
\begin{itemize}
\item It comes from a peer that is not in our remaining peers list.
\item It comes from the peer currently being dumped, but its route is
before the location of the route in the DumpIterator.
\item It comes from a peer that went down, and its subnet is before
the subnet we had reached while dumping that peer's RibIn when the
peer went down.
\item It comes from a peer that went down, and the RibIn GenID is
later than that peer's GenID was when it went down.  This would happen
because the peering has since come back up, and is now injecting new
routes.
\end{itemize}
These rules are restated as a predicate in the sketch at the end of
this section.

When all the routes in the RibIn of a particular peer have been
dumped, that peer is removed from the remaining peers list in the
DumpIterator, and the next {\tt dump\_next\_route} will be sent to the
RibIn of the next peer in the list.

When there are no peers left in the remaining peers list, the dump is
complete.  The DumpTable then unplumbs itself from the plumbing and
deletes itself.

Note that at any time there may be multiple DumpTables in operation,
each dumping to a different peer.  All the dump state is held in the
DumpIterators, so this does not cause any problem.
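As promised above, the sketch below restates the four propagation
rules as a single predicate over the DumpIterator's state.  The types
and field names are ours for illustration - the real DumpIterator is
considerably richer - and subnets are assumed to compare in the same
order in which the RibIn trie is walked.  Note that peers that went
down mid-dump have already left the remaining list, so they must be
checked first.

\begin{verbatim}
#include <cstdint>
#include <map>
#include <set>
#include <string>

typedef std::string PeerId;
typedef std::string Subnet;   // assumed to sort in trie-walk order

struct DownedPeer {
    Subnet reached;           // how far the dump had got
    std::uint32_t genid;      // RibIn GenID when the peer went down
};

struct DumpIterator {
    std::set<PeerId> remaining;            // peers not yet dumped
    std::map<PeerId, DownedPeer> downed;   // went down mid-dump
    PeerId current;                        // peer being dumped now
    Subnet last_dumped;                    // last subnet dumped
};

// Should a route change from 'from' for 'net' (with RibIn
// generation 'genid') be propagated downstream during the dump?
bool should_propagate(const DumpIterator& di, const PeerId& from,
                      const Subnet& net, std::uint32_t genid) {
    std::map<PeerId, DownedPeer>::const_iterator it =
        di.downed.find(from);
    if (it != di.downed.end()) {
        // Rules 3 and 4: the peer went down during the dump.
        return net <= it->second.reached || genid > it->second.genid;
    }
    if (from == di.current) {
        // Rule 2: have we already dumped past this subnet?
        return net <= di.last_dumped;
    }
    // Rule 1: peers no longer (or never) awaiting a dump always
    // propagate; peers still wholly ahead of us do not, because
    // the dump will reach their routes eventually.
    return di.remaining.count(from) == 0;
}
\end{verbatim}

\end{document}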