Network Working Group                                         D. Johnson
Request for Comments: 1297                           Merit Network, Inc.
                                                            January 1992


             NOC Internal Integrated Trouble Ticket System
                   Functional Specification Wishlist
                        ("NOC TT REQUIREMENTS")

Status of the Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard. Distribution of this memo is
   unlimited.

Abstract

   Professional quality handling of network problems requires some kind
   of problem tracking system, herein referred to as a "trouble ticket"
   system. A basic trouble ticket system acts like a hospital chart,
   coordinating the work of multiple people who may need to work on the
   problem.

   Once the basic trouble ticket system is in place, however, there are
   many extensions that can aid Network Operations efficiency.
   Information in the tickets can be used to produce statistical
   reports. Operator efficiency and accuracy may be increased by
   automating trouble ticket entry with information from the network
   Alert system. The Alert system may be used to monitor trouble ticket
   progress. Trouble tickets may also be used to communicate network
   health information between NOCs, to telecom vendors, and to other
   internal sales and engineering audiences.

   This document explores competing uses, architectures, and desirable
   features of integrated internal trouble ticket systems for Network
   and other Operations Centers.

Introduction

   This RFC describes general functions of a Trouble Ticket system that
   could be designed for Network Operations Centers. The document is
   being distributed to members of the Internet community in order to
   stimulate discussions of new production-oriented operator-level
   application tools for network operations.
   Hopefully, this will result both in more ideas for improving NOC
   performance, and in more available tools that incorporate those
   ideas.

PURPOSES OF A NOC TROUBLE TICKET SYSTEM

   A good Network Operations Trouble Ticket System should serve many
   purposes:

   1) SHORT-TERM MEMORY AND COMMUNICATION ("Hospital Chart"). The
   primary purpose of the trouble ticket system is to act as short-term
   memory about specific problems for the NOC as a whole. In a
   multi-operator or multi-shift NOC, calls and problem updates come in
   without regard to who worked last on a particular problem. Problems
   extend over shifts, and problems may be addressed by several
   different operators on the same shift. The trouble ticket (like a
   hospital chart) provides a complete history of the problem, so that
   any operator can come up to speed on a problem and take the next
   appropriate step without having to consult with other operators who
   are working on something else, or have gone home, or are on vacation.
   In single-room NOCs, an operator may ask out loud if someone else
   knows about or is working on a problem, but the system should allow
   for more formal communication as well.

   2) SCHEDULING and WORK ASSIGNMENT. NOCs typically work with many
   simultaneous problems with different priorities. An on-line trouble
   ticket system can provide real time (or even constantly displayed and
   updated) lists of open problems, sorted by priority. This would allow
   operators to sort their work at the beginning of a shift, and to pick
   their next task during the shift. It also would allow supervisors and
   operators to keep track of the current NOC workload, and to call in
   and assign additional staff as appropriate. It may be useful to allow
   the current priorities of tickets to change according to time of day,
   or in response to timer alerts.

   3) REFERRALS AND DISPATCHING.
   If the trouble ticket system is thoroughly enough integrated with a
   mail system, or if the system is used by Network Engineers as well as
   Network Operators, then some problems can be dispatched simply by
   placing the appropriate Engineer or Operator name in an "assigned to"
   field of the trouble ticket.

   4) ALARM CLOCK. Typically, most of the time a trouble ticket is open,
   it is waiting for something to happen. There should almost always be
   a timer associated with every wait. If a ticket is referred to a
   phone company, there will be an escalation time before which the
   phone company is supposed to call back with an update on the problem.
   For tickets referred to remote site personnel, there may be other
   more arbitrary timeouts such as "Monday morning". Tickets referred to
   local engineers or programmers should also have timeouts ("Check in a
   couple of days if you don't hear back from me").

   A good trouble ticket system will allow a timeout to be set for each
   ticket. This alarm will generate an alert for that ticket at the
   appropriate time. Preferably, the system should allow text to be
   attached to that timer with a shorthand message about what the alert
   involves ("Remind Site: TT xxx"). (The full story can always be found
   by checking the trouble ticket.) These alerts should feed into the
   NOC's standard alert system.

   The Alarm Clock can also assist (or enforce!) administrative
   escalation. An escalation timer could automatically be set based on
   the type of network, severity of the problem, and the time the outage
   occurred.

   5) OVERSIGHT BY ENGINEERS AND CUSTOMER/SITE REPRESENTATIVES. NOCs
   frequently operate more than one network, or at least have people
   (engineers, customer representatives, etc.) who are responsible for
   subsets of the total network.
   For these individual representatives, summaries of trouble tickets
   can be filtered by network or by node, and delivered electronically
   to the various engineers or site representatives. Each of these
   reports includes a summary of the previous day's trouble tickets for
   those sites, a listing of older trouble tickets still open, and a
   section listing recurrent problems. These reports allow the site reps
   to keep aware of the current outages and trends for their particular
   sites. The trouble ticket system also allows network access to the
   details of individual trouble tickets, so those receiving the general
   reports can get more detail on any of their problems by referencing
   the trouble ticket number.

   6) STATISTICAL ANALYSIS. The fixed-form fields of trouble tickets
   allow categorizations of tickets, which are useful for analyzing
   equipment and NOC performance. These include Mean Time Between
   Failure and Mean Time to Repair reports for specific equipment. The
   fields may also be of use for generating statistical quality control
   reports, which allow deteriorating equipment to be detected and
   serviced before it fails completely. Ticket breakdowns by network
   allow NOC costs to be apportioned appropriately, and help in
   developing staffing and funding models. A good trouble ticket system
   should make this statistical information available in a format
   suitable for spreadsheets and graphics programs.

   7) FILTERING CURRENT ALERTS. It would be possible to use network
   status information from the trouble ticket system to filter the
   alerts that are displayed on the alert system. For instance, if node
   XXX is known to be down because a trouble ticket is currently open
   on it, the alert display for that node could automatically be
   acknowledged. Trouble tickets could potentially contain much further
   information useful for expert system analysis of current network
   alert information.

   8) ACCOUNTABILITY ("CYA"), FACILITATING CUSTOMER FOLLOW-THROUGH, AND
   NOC IMAGE. Keeping user-complaint tickets facilitates the kind of
   follow-through with end-users that generates happy clients (and good
   NOC image) for normal trouble-fixing situations. But also, by their
   nature, NOCs deal with crises; they occasionally find themselves with
   major outages, and angry users or administrators. The trouble ticket
   system documents the NOC's (and the rest of the organization's)
   efforts to solve problems in case of complaints.

FIXED FIELDS, FREE-FORM FIELDS, and TT CONFIGURATION

   Information in trouble tickets can be placed in either fixed or
   free-form fields. Fixed fields have the advantage that they can be
   used more easily for searches. A series of fixed fields also acts as
   a template, either encouraging or requiring the operators to fill in
   certain standard data. Fixed fields can facilitate data verification
   (e.g., making sure an entered name is in an attached contacts
   database, or verifying that a phone number consists of ten numeric
   characters). Fixed fields are also appropriate for data that is
   automatically entered by the system, such as the operator's login id,
   the name of the node that was clicked on if the trouble ticket is
   opened via an alert tool, or names and phone numbers that are
   automatically entered into the ticket based on other entries (e.g.,
   filling in a contact name and phone based on a machine name).

   Unfortunately, fixed fields work best where the problem-debugging
   environment is uniform, well-understood, and stable; that is, trouble
   tickets work best when their fields are well tailored to the specific
   problem at hand. It is easy to set up a large number of fields (or
   even required fields) that are irrelevant to a given problem; this
   slows down and confuses the operators.
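
   As an illustration only, the data verification and automatic entry
   described for fixed fields could be sketched in Python as follows.
   The contacts table, the field names, the sample phone number, and
   the function names are all assumptions made for this example; this
   memo does not specify any of them.

```python
import re
from dataclasses import dataclass

# Hypothetical contacts database: machine name -> (contact name, phone).
# The entry below is made-up sample data, not taken from the memo.
CONTACTS = {
    "node-xxx": ("Site Contact", "3135550100"),
}

# "Ten numeric characters", as in the phone number example in the text.
PHONE_RE = re.compile(r"[0-9]{10}")


@dataclass
class TroubleTicket:
    operator_login: str      # fixed field, entered by the system
    machine: str             # fixed field, checked against CONTACTS
    contact_name: str = ""   # fixed field, may be filled in automatically
    contact_phone: str = ""  # fixed field, may be filled in automatically
    notes: str = ""          # free-form field, never validated


def autofill_contact(ticket):
    """Fill in empty contact fields based on the machine name, if known."""
    entry = CONTACTS.get(ticket.machine)
    if entry is None:
        return
    name, phone = entry
    ticket.contact_name = ticket.contact_name or name
    ticket.contact_phone = ticket.contact_phone or phone


def validate(ticket):
    """Return a list of problems found in the fixed fields (empty if none)."""
    problems = []
    if ticket.machine not in CONTACTS:
        problems.append("machine is not in the contacts database")
    if not PHONE_RE.fullmatch(ticket.contact_phone):
        problems.append("phone number must be ten numeric characters")
    return problems


ticket = TroubleTicket(operator_login="op1", machine="node-xxx")
autofill_contact(ticket)
print(validate(ticket))
```

   In this sketch, validate() reports problems instead of rejecting the
   ticket, so the caller decides whether a fixed field merely
   encourages or actually requires valid data.
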
   Adding structure and validity checking to a field tends to make the
   data more consistent and reliable, but it also tends to force the
   operators into longer procedures like menus to get the data accepted
   by the system. It also requires more maintenance of those
   verification systems (adding new entries as they become new legal
   options), and in some ways it reduces the accuracy of the system by
   forcing operators to choose "canned" or authorized responses that may
   not always represent the situation accurately.

   Where statistical operational reports are a primary purpose of the
   trouble ticket system, several fixed fields may be appropriate. If
   the primary intent of the system is to keep notes for individual
   problems and to facilitate