Network Working Group D. Johnson
Request for Comments: 1297 Merit Network, Inc.
January 1992
NOC Internal Integrated Trouble Ticket System
Functional Specification Wishlist
("NOC TT REQUIREMENTS")
Status of the Memo
This memo provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.
Abstract
Professional quality handling of network problems requires some kind
of problem tracking system, herein referred to as a "trouble ticket"
system. A basic trouble ticket system acts like a hospital chart,
coordinating the work of multiple people who may need to work on the
problem.
Once the basic trouble ticket system is in place, however, there are
many extensions that can aid Network Operations efficiency.
Information in the tickets can be used to produce statistical
reports. Operator efficiency and accuracy may be increased by
automating trouble ticket entry with information from the network
Alert system. The Alert system may be used to monitor trouble ticket
progress. Trouble tickets may also be used to communicate network
health information between NOCs, to telcom vendors, and to other
internal sales and engineering audiences.
This document explores competing uses, architectures, and desirable
features of integrated internal trouble ticket systems for Network
and other Operations Centers.
Introduction
This RFC describes general functions of a Trouble Ticket system that
could be designed for Network Operations Centers. The document is
being distributed to members of the Internet community in order to
stimulate discussions of new production-oriented operator-level
application tools for network operations. Hopefully, this will
result both in more ideas for improving NOC performance, and in more
available tools that incorporate those ideas.
Johnson [Page 1]
RFC 1297 NOC TT REQUIREMENTS January 1992
PURPOSES OF A NOC TROUBLE TICKET SYSTEM
A good Network Operations Trouble Ticket System should serve many
purposes:
1) SHORT-TERM MEMORY AND COMMUNICATION ("Hospital Chart"). The
primary purpose of the trouble ticket system is to act as short-
term memory about specific problems for the NOC as a whole. In a
multi-operator or multi-shift NOC, calls and problem updates come
in without regard to who worked last on a particular problem.
Problems extend over shifts, and problems may be addressed by
several different operators on the same shift. The trouble ticket
(like a hospital chart) provides a complete history of the
problem, so that any operator can come up to speed on a problem
and take the next appropriate step without having to consult with
other operators who are working on something else, or have gone
home, or are on vacation. In single-room NOCs, an operator may
ask out loud if someone else knows about or is working on a
problem, but the system should allow for more formal communication
as well.
2) SCHEDULING and WORK ASSIGNMENT. NOCs typically work with many
simultaneous problems with different priorities. An on-line
trouble ticket system can provide real time (or even constantly
displayed and updated) lists of open problems, sorted by priority.
This would allow operators to sort their work at the beginning of
a shift, and to pick their next task during the shift. It also
would allow supervisors and operators to keep track of the current
NOC workload, and to call in and assign additional staff as
appropriate.
It may be useful to allow current priorities of tickets to change
according to time of day, or in response to timer alerts.
3) REFERRALS AND DISPATCHING. If the trouble ticket system is
thoroughly enough integrated with a mail system, or if the system
is used by Network Engineers as well as Network Operators, then
some problems can be dispatched simply by placing the appropriate
Engineer or Operator name in an "assigned to" field of the trouble
ticket.
4) ALARM CLOCK. Typically, most of the time a trouble ticket is
open, it is waiting for something to happen. There should almost
always be a timer associated with every wait. If a ticket is
referred to a phone company, there will be an escalation time
before which the phone company is supposed to call back with an
update on the problem. For tickets referred to remote site
personnel, there may be other more arbitrary timeouts such as
"Monday morning". Tickets referred to local engineers or
programmers should also have timeouts ("Check in a couple of days
if you don't hear back from me"). A good trouble ticket system
will allow a timeout to be set for each ticket. This alarm will
generate an alert for that ticket at the appropriate time.
Preferably, the system should allow text to be attached to that
timer with a shorthand message about what the alert involves
("Remind Site: TT xxx"). (The full story can always be found by
checking the trouble ticket.) These alerts should feed into the
NOC's standard alert system.
The Alarm Clock can also assist (or enforce!) administrative
escalation. An escalation timer could automatically be set based
on the type of network, severity of the problem, and the time the
outage occurred.
5) OVERSIGHT BY ENGINEERS AND CUSTOMER/SITE REPRESENTATIVES. NOCs
frequently operate more than one network, or at least have people
(engineers, customer representatives, etc.) who are responsible for
subsets of the total network. For these individual
representatives, summaries of trouble tickets can be filtered by
network or by node, and delivered electronically to the various
engineers or site representatives. Each of these reports includes
a summary of the previous day's trouble tickets for those sites, a
listing of older trouble tickets still open, and a section listing
recurrent problems. These reports allow the site reps to keep
aware of the current outages and trends for their particular sites.
The trouble ticket system also allows network access to the
details of individual trouble tickets, so those receiving the
general reports can get more detail on any of their problems by
referencing the trouble ticket number.
6) STATISTICAL ANALYSIS. The fixed-form fields of trouble tickets
allow categorizations of tickets, which are useful for analyzing
equipment and NOC performance. These include Mean Time Between
Failure and Mean Time to Repair reports for specific equipment.
The fields may also be of use for generating statistical quality
control reports, which allow deteriorating equipment to be
detected and serviced before it fails completely. Ticket
breakdowns by network allow NOC costs to be apportioned appropriately,
and help in developing staffing and funding models. A good
trouble ticket system should make this statistical information
available in a format suitable for spreadsheets and graphics programs.
7) FILTERING CURRENT ALERTS. It would be possible to use network
status information from the trouble ticket system to filter the
alerts that are displayed on the alert system. For instance, if
node XXX is known to be down because the trouble ticket is
currently open on it, the alert display for that node could
automatically be acknowledged. Trouble tickets could potentially
contain much further information useful for expert system analysis
of current network alert information.
8) ACCOUNTABILITY ("CYA"), FACILITATING CUSTOMER FOLLOW-THROUGH,
AND NOC IMAGE. Keeping user-complaint tickets facilitates the
kind of follow through with end-users that generates happy clients
(and good NOC image) for normal trouble-fixing situations. But
also, by their nature, NOCs deal with crises; they occasionally
find themselves with major outages, and angry users or
administrators. The trouble ticket system documents the NOC's
(and the rest of the organization's) efforts to solve problems in
case of complaints.
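
As a rough illustration of purposes 2 and 4 above, the following sketch shows how a ticket record might carry a priority and a timer, so that an open-problem list and expired-timer alerts can be derived from the same data. It is only one possible shape, not a specified design: the Ticket fields, the convention that priority 1 is most urgent, and the function names open_by_priority and due_alerts are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Ticket:
    """Hypothetical ticket record; every field name here is illustrative."""
    number: int
    summary: str
    priority: int                        # assumption: 1 is most urgent
    opened: datetime
    assigned_to: Optional[str] = None    # "assigned to" field for referrals
    timeout: Optional[datetime] = None   # when the ticket's alarm should fire
    timer_note: str = ""                 # shorthand text attached to the timer
    closed: bool = False


def open_by_priority(tickets: List[Ticket]) -> List[Ticket]:
    """Return open tickets, most urgent first, oldest first within a priority."""
    return sorted((t for t in tickets if not t.closed),
                  key=lambda t: (t.priority, t.opened))


def due_alerts(tickets: List[Ticket], now: datetime) -> List[str]:
    """Return one short alert line per open ticket whose timer has expired."""
    return [f"TT {t.number}: {t.timer_note}"
            for t in open_by_priority(tickets)
            if t.timeout is not None and t.timeout <= now]
```

In a real NOC the lines returned by due_alerts would presumably be handed to the standard alert system rather than printed, and priorities might themselves be recomputed by time of day before sorting.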
FIXED FIELDS, FREE-FORM FIELDS, and TT CONFIGURATION
Information in trouble tickets can be placed in either fixed or
freeform fields. Fixed fields have the advantage that they can be
used more easily for searches. A series of fixed fields also acts as
a template, either encouraging or requiring the operators to fill in
certain standard data. Fixed fields can facilitate data verification
(e.g., making sure an entered name is in an attached contacts
database, or verifying that a phone number consists of ten numeric
characters). Fixed fields are also appropriate for data that is
automatically entered by the system, such as the operator's login id,
the name of the node that was clicked on if the trouble ticket is
opened via an alert tool, or names and phone numbers that are
automatically entered into the ticket based on other entries (e.g.,
filling in a contact name and phone based on a machine name).
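
The verification and automatic-entry ideas in this paragraph can be sketched briefly. The example below is a minimal illustration under stated assumptions, not a prescribed implementation: the CONTACTS table, its sample entry, the dictionary-based ticket, and the function names valid_phone, known_contact, and autofill_contact are all hypothetical.

```python
import re

# Hypothetical contacts database keyed by machine name:
# machine -> (contact name, ten-digit phone number).
CONTACTS = {
    "gw.example.net": ("Pat Example", "3135550100"),
}

PHONE_RE = re.compile(r"[0-9]{10}")


def valid_phone(value: str) -> bool:
    """True if the value consists of exactly ten numeric characters."""
    return PHONE_RE.fullmatch(value) is not None


def known_contact(name: str) -> bool:
    """True if the entered name appears in the attached contacts table."""
    return any(name == contact for contact, _ in CONTACTS.values())


def autofill_contact(ticket: dict) -> dict:
    """Return a copy of the ticket with contact fields filled from the machine name.

    Values the operator already entered are left untouched.
    """
    filled = dict(ticket)
    entry = CONTACTS.get(ticket.get("machine", ""))
    if entry is not None:
        filled.setdefault("contact_name", entry[0])
        filled.setdefault("contact_phone", entry[1])
    return filled
```

A strict ten-digit check like this one also shows the cost discussed in the next paragraph: a number typed with punctuation, or an international number, would be rejected until the operator reformats it.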
Unfortunately, fixed fields work best where the problem-debugging
environment is uniform, well-understood, and stable; that is, trouble
tickets work best when their fields are well tailored to the specific
problem at hand. It is easy to set up a large number of fields (or
even required fields) that are irrelevant to a given problem; this
slows down and confuses the operators. Adding structure and validity
checking to a field tends to make the data more consistent and
reliable, but it also tends to force the operators into longer
procedures like menus to get the data accepted by the system.
It also requires more maintenance on those verification
systems (adding new entries as they become new legal options), and in
some ways it reduces the accuracy of the system by forcing operators
to choose "canned" or authorized responses that may not always
represent the situation accurately. Where statistical operational
reports are a primary purpose of the trouble ticket system, several
fixed fields may be appropriate. If the primary intent of the system
is to keep notes for individual problems and to facilitate