If the \rtrmgr/\finder dies then all bets are off and all processes
should exit apart from the \xorpsh.

If a \xorp process exits unexpectedly the \rtrmgr/\finder should
attempt to restart the process.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{FEA - Forwarding Engine Abstraction}

The FEA primarily accepts routes from the RIB and places them in the
kernel. The FEA should tag all routes that it has installed in the
kernel. On restart, the FEA should remove all routes that a previous
incarnation of the FEA has placed in the kernel. When an FEA is
exiting it should attempt to remove all routes that it has installed
in the kernel.

The FEA process should register interest in the RIB. If the RIB fails
the FEA should withdraw all routes that the RIB has sent to it.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{MFEA - Multicast Forwarding Engine Abstraction}

The MFEA is the multicast analogue of the unicast FEA. It should be noted
that typically the MFEA would be part of the FEA process.

Similar to the FEA, on restart or exit the MFEA should remove all
multicast forwarding entries that were installed in the kernel. Note
that the MFEA does not contain a copy of the multicast forwarding entries
that were installed in the kernel, so it should utilize a
mechanism that removes all multicast forwarding entries at once. In
the case of UNIX-based systems, closing the multicast routing socket will
automatically remove all entries.

If the multicast routing process that has installed the multicast
forwarding entries exits, then the MFEA should remove all multicast
forwarding entries from the kernel. Currently, PIM is the only
multicast routing process. In the future, the XORP multicast routing
architecture may contain a special coordinator among all multicast
routing protocol instances, analogous to the function of the unicast
RIB process.
If that coordinator exits, the MFEA should remove all
multicast forwarding entries from the kernel.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{RIB - Routing Information Base}

Routes from the routing processes are sent to the RIB; the winners are
sent to the FEA.

The RIB should register interest in the FEA. If the FEA fails the RIB
should exit. All routing processes that interact with the RIB should,
on detecting the shutdown of the RIB, also terminate gracefully.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{IGMP/MLD}

If the FEA/MFEA process exits then this process should exit.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{PIM}

If the RIB or the FEA/MFEA process exits then this process should exit.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{BGP}

Currently the only other process in the system that BGP interacts
with is the RIB. If the BGP process detects that the RIB has died then
it should gracefully terminate its sessions and exit.

In the future the TCP connections that BGP makes will be mediated
through the FEA, at which time the BGP process should also register
interest in the state of the FEA. If the BGP process detects the death
of the FEA it should exit immediately.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{RIP}

The RIP process should register interest in the FEA and the RIB. If
the RIB dies then the RIP process should attempt to exit gracefully.
If the FEA dies the RIP process should exit immediately.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{IS-IS}

The IS-IS process should register interest in the FEA and the RIB. If
the RIB dies then the IS-IS process should attempt to exit gracefully.
If the FEA dies the IS-IS process should exit immediately.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{OSPF}

The OSPF process should register interest in the FEA and the RIB.
If the RIB dies then the OSPF process should attempt to exit gracefully.
If the FEA dies the OSPF process should exit immediately.

%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{\label{xorpsh}\xorpsh}

The \xorpsh provides a command line interface to the XORP router.
Other processes in the system exiting should never cause it to
exit. The \rtrmgr/\finder process exiting should generate
warning output to the user and then the \xorpsh should wait for the
router to restart.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{XRL Communication Errors}

Interprocess communication in \xorp is achieved using XRLs. In this
section we will consider what should be done when an XRL call fails
due to a communication error.

XRLs can be sent over unreliable transports such as UDP or reliable
transports such as TCP. The type of transport used is decided by the
XRL library based on the specification of each interface. For the
purposes of error handling, the reliable and unreliable transports are
the same in all regards, except that reliable transports in XORP never
explicitly report a timeout error.

XRL communication is asynchronous: applications request the dispatch
of an XRL and expect to have a callback invoked when the dispatch
result is available. This presents opportunities for immediate and
deferred error indications. Immediate error indications occur when
the request for XRL dispatch is made: the canonical example occurs
when no more buffer space is available within the XRL library.
An application is able to detect these errors
synchronously: the dispatch request indicates an error in its return
value. Deferred error indications happen through the dispatch
callbacks. These callbacks are required to take an XrlError object as
an argument. An XrlError object consists of an enumerated error
code and an optional string containing specific information relating
to the error. The set of enumerated error codes is presented below.

Immediate and deferred errors are exclusive.
If the \xt dispatching
an XRL gets an immediate error, it will not receive a callback
indicating a deferred error.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection*{Standard Dispatch XRL Error Values}

The standard XRL return values are returned to the requesting \xt by
the dispatching \xt. When any of these values is returned, the XRL
communication has been successful.

\begin{description}
  \item [OKAY] XRL dispatch successful. Additional parameters in the XRL
  callback contain return values.

  \item [COMMAND\_FAILED] XRL reached dispatcher, but could not be
  dispatched. The reason for failure may be specified in the note
  associated with the XrlError object.

  \item [BAD\_ARGS] XRL reached dispatcher, but argument types did not
  match those expected by the dispatcher.
\end{description}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection*{Finder XRL Error Values}

\begin{description}
  \item [NO\_FINDER] This error occurs when an \xt cannot communicate
  with the \finder. This always indicates a serious problem with the
  router, as the \finder should always be present. The application
  SHOULD treat this error as fatal.

  \item [RESOLVE\_FAILED] This error occurs when a \xt process tries to
  resolve an XRL the \finder has no result for. This may be because
  the target specified in the XRL does not exist, or exists but is
  still in the process of registering the XRLs it exports.

  RESOLVE\_FAILED errors may happen because of a benign cause, namely
  that processes started up in a less than perfect order, so a
  target's user has initialized before the target itself.
  Applications SHOULD handle this type of transient RESOLVE\_FAILED
  error with a retransmission strategy. Applications may avoid this
  error by using the Finder event observer interface to detect when
  the particular target becomes ready.

  \item [NO\_SUCH\_METHOD] This error occurs when the named \xt is
  running and has registered its XRLs, but it does not support the
  method named in the XRL.
  NO\_SUCH\_METHOD generally indicates a version mismatch between two
  processes. This error may be considered fatal, or (for example) the
  application might react by trying to access an older version of the
  interface. The application can expect, however, that
  NO\_SUCH\_METHOD errors are not transient: if an XRL access gets a
  NO\_SUCH\_METHOD error, then that XRL will always result in a
  NO\_SUCH\_METHOD error, at least until the target process restarts.
\end{description}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection*{Transport and Internal XRL Error Values}

\begin{description}
  \item [SEND\_FAILED] The underlying XRL transport mechanism has
  failed. For example, the TCP connection has been reset, or a UDP
  connection gets a port-unreachable message. The expectation is that
  no further communication with the specific endpoint will succeed.

  \item [SEND\_FAILED\_TRANSIENT] This error occurs when the XRL library
  temporarily cannot send a particular XRL. Usually, this will be
  because of congestion or a slow receiver: the kernel has run out of
  buffer space. Note that the XRL library performs some buffering
  itself, to ensure that XRL requests are either completely
  transmitted or not transmitted at all. \emph{Note: The XRL library
  does not yet implement this error.}

  \item [REPLY\_TIMED\_OUT] The target did not reply within a
  transport-protocol-specific period of time. Possible reasons
  include network congestion, peer failure, network interface failure,
  and so on. As in all network communications, when a timeout occurs
  we do not know whether the last unacknowledged XRL request was
  received and processed by the peer. This error occurs only with
  unreliable transports.
\end{description}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Handling XRL Errors}

XRLs may be directed to a class of target or a particular instance of
a target.
The first instance of a target that registers with the
\finder is considered to be the primary instance of its class, and XRLs
addressed to that class are directed to that instance. The XRL library MAY
hide certain REPLY\_TIMED\_OUT and SEND\_FAILED errors for XRLs
directed towards classes, \ie should the instance which is acting as
the primary instance fail or exit, then another instance in that
class will receive the class-directed XRL requests.

%When communication with a particular
%instance of the target fails, the XRL library thus MAY search for
%another instance by contacting the \finder. The XRL library MUST NOT
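
As a rough illustration of how an application might act on the
deferred error codes listed earlier, the following C++ sketch maps each
code to a reaction. It is not the XRL library's API: the
\texttt{XrlErrorCode} and \texttt{Reaction} enumerations, the
\texttt{classify} function and the \texttt{idempotent} flag are all
hypothetical names introduced here, and the bounded retry count is only
one possible retransmission strategy.

```cpp
#include <cassert>

// Hypothetical mirror of the error codes described in this section; the
// real XrlError class may name or number these differently.
enum class XrlErrorCode {
    OKAY,
    COMMAND_FAILED,
    BAD_ARGS,
    NO_FINDER,
    RESOLVE_FAILED,
    NO_SUCH_METHOD,
    SEND_FAILED,
    SEND_FAILED_TRANSIENT,
    REPLY_TIMED_OUT
};

// Hypothetical reactions an application might choose between.
enum class Reaction {
    PROCEED,    // XRL succeeded: use the return values
    REPORT,     // communication worked, but the request was rejected
    RETRY,      // condition may be transient: resend later
    DROP_PEER,  // expect no further success with this endpoint
    EXIT        // treat as fatal for this process
};

// Map a deferred error code to a reaction.
//   attempts     - how many times this XRL has already been sent
//   max_attempts - assumed upper bound on retransmissions
//   idempotent   - assumed caller-supplied flag: true if resending the
//                  XRL is harmless even when the peer already processed it
inline Reaction classify(XrlErrorCode code, int attempts, int max_attempts,
                         bool idempotent)
{
    assert(attempts >= 0 && max_attempts >= 0);
    const bool may_retry = attempts < max_attempts;

    switch (code) {
    case XrlErrorCode::OKAY:
        return Reaction::PROCEED;
    case XrlErrorCode::COMMAND_FAILED:
    case XrlErrorCode::BAD_ARGS:
        return Reaction::REPORT;
    case XrlErrorCode::NO_FINDER:
        // The finder should always be present, so this is fatal.
        return Reaction::EXIT;
    case XrlErrorCode::RESOLVE_FAILED:
    case XrlErrorCode::SEND_FAILED_TRANSIENT:
        // Possibly benign (start-up ordering, congestion): retransmit.
        return may_retry ? Reaction::RETRY : Reaction::DROP_PEER;
    case XrlErrorCode::NO_SUCH_METHOD:
        // Not transient: retrying the same XRL cannot succeed.
        return Reaction::REPORT;
    case XrlErrorCode::SEND_FAILED:
        return Reaction::DROP_PEER;
    case XrlErrorCode::REPLY_TIMED_OUT:
        // The peer may or may not have processed the request.
        return (idempotent && may_retry) ? Reaction::RETRY
                                         : Reaction::DROP_PEER;
    }
    return Reaction::EXIT;  // unknown code: be conservative
}
```

A dispatch callback could call \texttt{classify} with the code carried by
its XrlError argument and then schedule a retransmission, log the
failure, or begin shutting down accordingly. Whether a timed-out XRL is
safe to resend depends on the interface, which is why the sketch leaves
that decision to the caller.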