RFC 1224 Managing Asynchronously Generated Alerts May 1991

6.1.1 Example

In a sample system (based on the example in Appendix A), a manager
must monitor 40 remote agents, each having between 2 and 15
parameters which indicate the relative health of the agent and the
network. During normal monitoring, the manager is concerned only
with fault detection. With an average poll request-response time of
5 seconds, the manager polls one MIB variable on each node. This
involves one request and one reply packet of the format specified in
the XYZ network management protocol. Each packet requires 120 bytes
"on the wire" (requesting a single object, ASN.1 encoded, IP and UDP
enveloped, and placed in an Ethernet packet). This results in a
serial poll cycle time of 3.3 minutes (40 nodes at 5 seconds each is
200 seconds), and a mean time to detect alert of slightly over 1.5
minutes. The total amount of data transferred during a 3.3 minute
poll cycle is 9600 bytes (one 120 byte request and one 120 byte
reply for each of 40 nodes). With such a small amount of network
management traffic per minute, the poll rate might reasonably be
doubled (assuming the network performance permits it). The result
is 19200 bytes transferred in each 3.3 minute period (two poll
cycles of 100 seconds each), and a mean time to detect failure of
under 1 minute. Parallel polling obviously yields similar
improvements.

Should an alert be returned by a remote agent's log, the manager
notifies the operator and removes the element from the alert log by
setting it with SNMP or deleting it with CMOT. Normal alert
detection procedures are then followed. Those SNMP implementers who
prefer not to use SNMP SET for table entry deletes may always define
their log as "read only".

The fact that the manager made a single query (to the log) and was
able to determine which, if any, objects merited special attention
essentially means that the status of all alert capable objects was
monitored with a single request.
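
The arithmetic in this example can be checked with a short
calculation. The sketch below is illustrative only: the function and
variable names are not part of this memo, and it assumes that
"doubling the poll rate" means each agent is polled twice in the
original 200 second interval.

```python
# Figures taken from the example; all names here are hypothetical.
AGENTS = 40
POLL_SECONDS = 5      # average request-response time per agent
PACKET_BYTES = 120    # one request or one reply "on the wire"


def poll_cycle(agents, poll_seconds, packet_bytes, rate_multiplier=1):
    """Return (cycle_seconds, mean_detect_seconds, bytes_per_cycle)."""
    cycle_seconds = agents * poll_seconds / rate_multiplier
    # An alert is raised, on average, halfway through a poll cycle.
    mean_detect_seconds = cycle_seconds / 2
    # One request and one reply packet per agent per cycle.
    bytes_per_cycle = agents * 2 * packet_bytes
    return cycle_seconds, mean_detect_seconds, bytes_per_cycle


def bytes_in_window(window_seconds, cycle_seconds, bytes_per_cycle):
    """Bytes transferred over a fixed interval of wall-clock time."""
    return window_seconds / cycle_seconds * bytes_per_cycle


base_cycle, base_detect, base_bytes = poll_cycle(
    AGENTS, POLL_SECONDS, PACKET_BYTES)
fast_cycle, fast_detect, fast_bytes = poll_cycle(
    AGENTS, POLL_SECONDS, PACKET_BYTES, rate_multiplier=2)

print(f"normal rate:  cycle {base_cycle:.0f} s, "
      f"mean detect {base_detect:.0f} s, {base_bytes} bytes per cycle")
print(f"doubled rate: cycle {fast_cycle:.0f} s, "
      f"mean detect {fast_detect:.0f} s, "
      f"{bytes_in_window(base_cycle, fast_cycle, fast_bytes):.0f} bytes "
      f"per {base_cycle:.0f} s")
```

At the normal rate this gives a 200 second cycle, a 100 second mean
time to detect, and 9600 bytes per cycle. At the doubled rate the
mean time to detect falls to 50 seconds, and 19200 bytes cross the
network in the same 200 seconds.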

Continuing the above example, should a remote entity fail to respond
to two successive poll attempts, the operator is notified that the
agent is not reachable. The operator may then choose (if so
equipped) to contact the agent through an alternate path (such as
serial line IP over a dial-up modem). Upon establishing such a
connection, the manager may then retrieve the contents of the alert
log for a chronological map of the failure's alerts. Alerts
undelivered because of conditions that may no longer be present are
still available for analysis.

6.2 Notes on Polled, Logged Alerts

Polled, logged alert techniques allow the tracking of many alerts
while actually monitoring only a single MIB object. This
dramatically decreases the amount of network management data that
must flow across the network to determine the status. Reducing the
number of requests needed to track multiple objects (to one) greatly
improves the poll cycle time. This allows a faster poll cycle (and
so a shorter mean time to detect alert) with less overhead than
would be caused by pure polling.

Steinberg [Page 12]

In addition, this technique scales well to large networks, as the
concept of polling a single object to learn the status of many lends
itself well to hierarchies. A proxy manager may be polled to learn
if it has found any alerts in the logs of the agents it polls. Of
course, this scaling does not save on the mean time to learn of an
alert (the cycle times of the manager and the proxy manager must be
considered), but the amount of network management polling traffic is
concentrated at lower levels. Only a small amount of such traffic
need be passed over the network's "backbone"; that is, the traffic
generated by the request-response from the manager to the proxy
managers.

Note that it is best to return the oldest logged alert as the first
table entry.
This is the object most likely to be overwritten, and every attempt
should be made to ensure that the manager has seen it. In a system
where log entries may be removed by the manager, the manager will
probably wish to attempt to keep all remote alert logs empty to
reduce the number of alerts dropped or overwritten. In any case,
the order in which table entries are returned is a function of the
table mechanism, and is implementation and/or protocol specific.

"Polled, logged alerts" offers all of the advantages inherent in
polling (reliable detection of failures, reduced agent complexity
with UDP, etc.), while minimizing the typical polling problems
(potentially shorter poll cycle time and reduced network management
traffic).

Finally, alerts are not lost when an agent is isolated from its
manager. When a connection is reestablished, a history of
conditions that may no longer be in effect is available to the
manager. While not a part of this document, it is worthwhile to
note that this same log architecture can be employed to archive
alert and other information on remote hosts. However, such
non-local storage is not sufficient to meet the reliability
requirements of "polled, logged alerts".

7. Compatibility with SNMP [4] and CMOT [3]

7.1 Closed Loop (Feedback) Alert Reporting

7.1.1 Use of Feedback with SNMP

At configuration time, an SNMP agent supporting Feedback/Pin is
loaded with default values of "windowTime" and "maxAlertsPerTime",
and "alertsEnabled" is set to TRUE. The manager issues an SNMP GET
to determine "maxAlertsPerTime" and "windowTime", and to verify the
state of "alertsEnabled". Should the agent support setting Pin
objects, the manager may choose to alter these values (via an SNMP
SET).
The new values are calculated based upon known network resource
limitations (e.g., the number of packets the manager's gateway can
support) and the number of agents potentially reporting to this
manager.

Upon receipt of an "alertsDisabled" trap, a manager whose state and
network are not overutilized immediately issues an SNMP SET to make
"alertsEnabled" TRUE. Should an excessive number of
"alertsDisabled" traps regularly occur, the manager might revisit
the values chosen for implementing the Pin mechanism. Note that an
overutilized system expects its manager to delay the resetting of
"alertsEnabled".

As a part of each regular polling cycle, the manager includes a GET
REQUEST for the value of "alertsEnabled". If this value is FALSE,
it is SET to TRUE, and the potential loss of traps (while it was
FALSE) is noted.

7.1.2 Use of Feedback with CMOT

The use of CMOT in implementing Feedback/Pin is essentially
identical to the use of SNMP. CMOT GET, SET, and EVENT replace
their SNMP counterparts.

7.2 Polled, Logged Alerts

7.2.1 Use of Polled, Logged Alerts with SNMP

As a part of regular polling, an SNMP manager using polled, logged
alerts may issue a GET NEXT request naming
{ alertLog logTableEntry(1) alertId(1) 0 }. Returned is either the
alertId of the first table entry or, if the table is empty, an SNMP
reply whose object is the "lexicographical successor" to the alert
log.

Should an "alertId" be returned, the manager issues an SNMP GET
naming { alertLog logTableEntry(1) alertData(2) value } where
"value" is the alertId integer obtained from the previously
described GET NEXT. This returns the SNMP TRAP encapsulated within
an OPAQUE.

If the agent supports the deletion of table entries through SNMP
SETs, the manager may then issue a SET of
{ alertLog logTableEntry(1) alertId(1) value } to remove the entry
from the log.
Otherwise, the next GET NEXT poll of this agent should request the
first "alertId" following the instance of "value" rather than an
instance of "0".

7.2.2 Use of Polled, Logged Alerts with CMOT

Using polled, logged alerts with CMOT is similar to using them with
SNMP. In order to test for table entries, one uses a CMOT GET and
specifies scoping to the alertLog. The request is for all table
entries that have an alertId value greater than the last known
alertId, or greater than zero if the table is normally kept empty by
the manager. Should the agent support it, entries are removed with
a CMOT DELETE, an object of alertLog.entry, and a distinguishing
attribute of the alertId to remove.

8. Multiple Manager Environments

The conflicts between multiple managers with overlapping
administrative domains (generally found in larger networks) tend to
be resolved in protocol specific manners. This document has not
addressed them. However, real world demands require alert
management techniques to function in such environments.

Complex agents can clearly respond to different managers (or
managers in different "communities") with different reply values.
This allows feedback and polled, logged alerts to appear completely
independent to differing autonomous regions (each region sees its
own value). Differing feedback thresholds might exist, and feedback
can be actively blocking alerts to one manager even after another
manager has reenabled its own alert reporting. All of this is
transparent to an SNMP user if based on communities, or each manager
can work with a different copy of the relevant MIB objects. Those
implementing CMOT might view these as multiple instances of the same
feedback objects (and allow one manager to query the state of
another's feedback mechanism).

The same holds true for polled, logged alerts. One manager (or
manager in a single community/region) can delete an alert from its
view without affecting the view of another region's managers.
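
The per-community views described above might be modelled as in the
following sketch. It is a minimal illustration under stated
assumptions, not a definitive agent implementation: the class
"AlertLog" and its methods are hypothetical names, no real SNMP or
CMOT library is used, communities are plain strings, and log
overflow (overwriting the oldest entry) is not handled.

```python
class AlertLog:
    """Hypothetical agent-side alert log with one view per community."""

    def __init__(self):
        self._next_id = 1
        self._entries = {}   # alertId -> logged alert data
        self._hidden = {}    # community -> alertIds deleted from its view

    def log(self, data):
        """Append an alert and return its alertId."""
        alert_id = self._next_id
        self._next_id += 1
        self._entries[alert_id] = data
        return alert_id

    def get_next(self, community, after=0):
        """Return the first (alertId, data) with alertId > after that is
        still visible to this community, or None if there is none."""
        hidden = self._hidden.get(community, set())
        for alert_id in sorted(self._entries):
            if alert_id > after and alert_id not in hidden:
                return alert_id, self._entries[alert_id]
        return None

    def delete(self, community, alert_id):
        """Remove an entry from one community's view only."""
        self._hidden.setdefault(community, set()).add(alert_id)


alert_log = AlertLog()
first = alert_log.log("linkDown on interface 2")
second = alert_log.log("coldStart")

# The "east" manager handles and deletes the first alert.
alert_log.delete("east", first)

print(alert_log.get_next("east"))   # east now sees only the second alert
print(alert_log.get_next("west"))   # west still sees the first alert
```

The "after" argument mirrors the case where an agent does not
support deletion: a manager passes the last alertId it has seen
instead of 0. An agent built this way would still need a policy for
truly freeing an entry once every community has deleted it.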

Those preferring less complex agents will recognize the opportunity
to instrument proxy management. Alerts might be distributed from a
manager based alert exploder which effectively implements feedback
and polled, logged alerts for its subscribers. Feedback parameters
are set on each agent to the highest rate of any subscriber, and
limited by the distributor. Logged alerts are deleted from the view
at the proxy manager, and truly deleted at the agent only when all
subscribers have so requested, or immediately deleted at the agent
with the first proxy request, and maintained as virtual entries by
the proxy manager for the benefit of other subscribers.

9. Summary

While "polled, logged alerts" may be useful, they still have a
limitation: the mean time to detect failures and alerts increases
linearly as networks grow in size (hierarchies offer shorter
individual poll cycle times, but the mean detection time is the sum
of 1/2 of each cycle time). For this reason, it may be necessary to
supplement asynchronous generation of alerts (and "polled, logged
alerts") with unrequested transmission of the alerts on very large
networks.

Whenever systems generate and asynchronously transmit alerts, the
potential to overburden (over-inform) a management station exists.
Mechanisms to protect a manager, such as the "Feedback/Pin"
technique, risk losing potentially important information. Failure
to implement asynchronous alerts increases the time for the manager
to detect and react to a problem. Over-reporting may appear to be a
less critical (and less likely) problem than under-informing, but
the potential for harm exists with unbounded alert generation.

An ideal management system will generate alerts to notify its
management station (or stations) of error conditions. However,
these alerts must be self limiting with required positive feedback.
In addition, the manager should periodically poll to ensure
connectivity to remote stations, and to retrieve copies of any
alerts that were not delivered by the network.

10. References

[1] Rose, M., and K. McCloghrie, "Structure and Identification of
    Management Information for TCP/IP-based Internets", RFC 1155,
    Performance Systems International and Hughes LAN Systems, May
    1990.

[2] McCloghrie, K., and M. Rose, "Management Information Base for
    Network Management of TCP/IP-based internets", RFC 1213, Hughes
    LAN Systems, Inc., Performance Systems International, March
    1991.

[3] Warrier, U., Besaw, L., LaBarre, L., and B. Handspicker, "Common
    Management Information Services and Protocols for the Internet
    (CMOT) and (CMIP)", RFC 1189, Netlabs, Hewlett-Packard, The
    Mitre Corporation, Digital Equipment Corporation, October 1990.

[4] Case, J., Fedor, M., Schoffstall, M., and C. Davin, "Simple
    Network Management Protocol", RFC 1157, SNMP Research,
    Performance Systems International, Performance Systems
    International, MIT Laboratory for Computer Science, May 1990.

[5] Reynolds, J., and J. Postel, "Assigned Numbers", RFC 1060,
    USC/Information Sciences Institute, March 1990.

11. Acknowledgements

This memo is the product of work by the members of the IETF
Alert-Man Working Group and other interested parties, whose efforts
are gratefully acknowledged here:

   Amatzia Ben-Artzi    Synoptics Communications
   Neal Bierbaum        Vitalink Corp.
   Jeff Case            University of Tennessee at Knoxville
   John Cook            Chipcom Corp.
   James Davin          MIT
   Mark Fedor           Performance Systems International, Inc.
   Steven Hunter        Lawrence Livermore National Labs
   Frank Kastenholz     Clearpoint Research