RFC 1889 RTP January 1996
All header data is aligned to its natural length, i.e., 16-bit fields
are aligned on even offsets, 32-bit fields are aligned at offsets
divisible by four, etc. Octets designated as padding have the value
zero.
Wallclock time (absolute time) is represented using the timestamp
format of the Network Time Protocol (NTP), which is in seconds
relative to 0h UTC on 1 January 1900 [5]. The full resolution NTP
timestamp is a 64-bit unsigned fixed-point number with the integer
part in the first 32 bits and the fractional part in the last 32
bits. In some fields where a more compact representation is
appropriate, only the middle 32 bits are used; that is, the low 16
bits of the integer part and the high 16 bits of the fractional part.
The high 16 bits of the integer part must be determined
independently.
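As an illustration of the two representations described above, the following minimal Python sketch builds a 64-bit NTP timestamp and extracts its middle 32 bits. The function names and the use of a Unix-epoch input are assumptions made for this example and are not part of this specification; the constant is the number of seconds between 1 January 1900 and 1 January 1970.

```python
NTP_UNIX_OFFSET = 2_208_988_800  # seconds from 0h UTC 1 Jan 1900 to 0h UTC 1 Jan 1970


def unix_to_ntp64(unix_seconds: float) -> int:
    """Return a 64-bit NTP timestamp: 32-bit integer part, 32-bit fraction."""
    ntp_seconds = unix_seconds + NTP_UNIX_OFFSET
    whole = int(ntp_seconds)
    integer_part = whole & 0xFFFFFFFF
    fraction_part = int((ntp_seconds - whole) * (1 << 32)) & 0xFFFFFFFF
    return (integer_part << 32) | fraction_part


def ntp64_to_compact32(ntp64: int) -> int:
    """Keep the middle 32 bits: low 16 bits of the integer part,
    high 16 bits of the fractional part."""
    return (ntp64 >> 16) & 0xFFFFFFFF
```

The compact form discards the high 16 bits of the seconds count, so a receiver of the compact value has to recover them from some other source.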
5. RTP Data Transfer Protocol
5.1 RTP Fixed Header Fields
The RTP header has the following format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first twelve octets are present in every RTP packet, while the
list of CSRC identifiers is present only when inserted by a mixer.
The fields have the following meaning:
version (V): 2 bits
This field identifies the version of RTP. The version defined by
this specification is two (2). (The value 1 is used by the first
draft version of RTP and the value 0 is used by the protocol
initially implemented in the "vat" audio tool.)
padding (P): 1 bit
If the padding bit is set, the packet contains one or more
additional padding octets at the end which are not part of the
Schulzrinne, et al Standards Track [Page 10]
payload. The last octet of the padding contains a count of how
many padding octets should be ignored. Padding may be needed by
some encryption algorithms with fixed block sizes or for
carrying several RTP packets in a lower-layer protocol data
unit.
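The padding rule can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the count in the last octet includes that octet itself, since the count octet is the last octet of the padding. It also assumes no CSRC list or header extension when validating the count.

```python
def strip_padding(packet: bytes) -> bytes:
    """Return the packet with trailing padding octets removed.

    The P bit in the returned header is left unchanged; only the
    trailing octets are dropped.
    """
    if len(packet) < 12:
        raise ValueError("shorter than the 12-octet fixed header")
    if not packet[0] & 0x20:       # P is the third most significant bit of octet 0
        return packet
    count = packet[-1]             # assumed to count itself as one padding octet
    if count == 0 or count > len(packet) - 12:
        raise ValueError("invalid padding count")
    return packet[:-count]
```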
extension (X): 1 bit
If the extension bit is set, the fixed header is followed by
exactly one header extension, with a format defined in Section
5.3.1.
CSRC count (CC): 4 bits
The CSRC count contains the number of CSRC identifiers that
follow the fixed header.
marker (M): 1 bit
The interpretation of the marker is defined by a profile. It is
intended to allow significant events such as frame boundaries to
be marked in the packet stream. A profile may define additional
marker bits or specify that there is no marker bit by changing
the number of bits in the payload type field (see Section 5.3).
payload type (PT): 7 bits
This field identifies the format of the RTP payload and
determines its interpretation by the application. A profile
specifies a default static mapping of payload type codes to
payload formats. Additional payload type codes may be defined
dynamically through non-RTP means (see Section 3). An initial
set of default mappings for audio and video is specified in the
companion profile Internet-Draft draft-ietf-avt-profile, and
may be extended in future editions of the Assigned Numbers RFC
[6]. An RTP sender emits a single RTP payload type at any given
time; this field is not intended for multiplexing separate media
streams (see Section 5.2).
sequence number: 16 bits
The sequence number increments by one for each RTP data packet
sent, and may be used by the receiver to detect packet loss and
to restore packet sequence. The initial value of the sequence
number is random (unpredictable) to make known-plaintext attacks
on encryption more difficult, even if the source itself does not
encrypt, because the packets may flow through a translator that
does. Techniques for choosing unpredictable numbers are
discussed in [7].
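A receiver comparing sequence numbers has to allow for wraparound of the 16-bit field. The sketch below is one possible approach and is not taken from this specification: the helper names are hypothetical, and the use of Python's secrets module for the initial value is an assumption chosen for illustration.

```python
import secrets


def initial_sequence_number() -> int:
    """Pick an unpredictable 16-bit starting value."""
    return secrets.randbelow(1 << 16)


def seq_delta(previous: int, current: int) -> int:
    """Signed distance from previous to current in 16-bit sequence space."""
    delta = (current - previous) & 0xFFFF
    return delta - 0x10000 if delta >= 0x8000 else delta


def packets_missing_between(previous: int, current: int) -> int:
    """Number of sequence numbers skipped between two received packets.

    A non-positive delta (reordered or duplicate packet) reports zero.
    """
    return max(seq_delta(previous, current) - 1, 0)
```

For example, seq_delta(65535, 0) yields 1, so a wrap from 65535 to 0 is treated as the next packet rather than as a large gap.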
timestamp: 32 bits
The timestamp reflects the sampling instant of the first octet
in the RTP data packet. The sampling instant must be derived
from a clock that increments monotonically and linearly in time
to allow synchronization and jitter calculations (see Section
6.3.1). The resolution of the clock must be sufficient for the
desired synchronization accuracy and for measuring packet
arrival jitter (one tick per video frame is typically not
sufficient). The clock frequency is dependent on the format of
data carried as payload and is specified statically in the
profile or payload format specification that defines the format,
or may be specified dynamically for payload formats defined
through non-RTP means. If RTP packets are generated
periodically, the nominal sampling instant as determined from
the sampling clock is to be used, not a reading of the system
clock. As an example, for fixed-rate audio the timestamp clock
would likely increment by one for each sampling period. If an
audio application reads blocks covering 160 sampling periods
from the input device, the timestamp would be increased by 160
for each such block, regardless of whether the block is
transmitted in a packet or dropped as silent.
The initial value of the timestamp is random, as for the sequence
number. Several consecutive RTP packets may have equal timestamps if
they are (logically) generated at once, e.g., belong to the same
video frame. Consecutive RTP packets may contain timestamps that are
not monotonic if the data is not transmitted in the order it was
sampled, as in the case of MPEG interpolated video frames. (The
sequence numbers of the packets as transmitted will still be
monotonic.)
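The fixed-rate audio example above can be sketched as follows: the timestamp advances by the block size for every block read from the input device, while the sequence number advances only for blocks that are actually transmitted. The function name and the list-of-flags input are hypothetical conveniences for this illustration.

```python
def schedule(silent_flags, first_seq, first_ts, samples_per_block=160):
    """Return (sequence number, timestamp) for each transmitted block."""
    sent = []
    seq, ts = first_seq, first_ts
    for silent in silent_flags:
        if not silent:
            sent.append((seq, ts))
            seq = (seq + 1) & 0xFFFF                 # only packets actually sent
        ts = (ts + samples_per_block) & 0xFFFFFFFF   # every block, sent or not
    return sent
```

With three blocks of which the second is dropped as silent, the two transmitted packets carry consecutive sequence numbers but timestamps that differ by 320.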
SSRC: 32 bits
The SSRC field identifies the synchronization source. This
identifier is chosen randomly, with the intent that no two
synchronization sources within the same RTP session will have
the same SSRC identifier. An example algorithm for generating a
random identifier is presented in Appendix A.6. Although the
probability of multiple sources choosing the same identifier is
low, all RTP implementations must be prepared to detect and
resolve collisions. Section 8 describes the probability of
collision along with a mechanism for resolving collisions and
detecting RTP-level forwarding loops based on the uniqueness of
the SSRC identifier. If a source changes its source transport
address, it must also choose a new SSRC identifier to avoid
being interpreted as a looped source.
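The following Python sketch illustrates random SSRC selection and a very simple check for an identifier seen from more than one source transport address. It is not the algorithm of Appendix A.6 or the procedure of Section 8: it substitutes Python's secrets module as the source of randomness, and the function names and the table layout are assumptions made for this example.

```python
import secrets


def new_ssrc(in_use) -> int:
    """Pick a random 32-bit identifier not already known in this session."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in in_use:
            return candidate


def seen_from_other_address(table: dict, ssrc: int, transport_addr: tuple) -> bool:
    """Record ssrc -> first source transport address seen.

    Returns True if this SSRC previously arrived from a different
    address, which may indicate a collision or a loop.
    """
    known = table.setdefault(ssrc, transport_addr)
    return known != transport_addr
```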
CSRC list: 0 to 15 items, 32 bits each
The CSRC list identifies the contributing sources for the
payload contained in this packet. The number of identifiers is
given by the CC field. If there are more than 15 contributing
sources, only 15 may be identified. CSRC identifiers are
inserted by mixers, using the SSRC identifiers of contributing
sources. For example, for audio packets the SSRC identifiers of
all sources that were mixed together to create a packet are
listed, allowing correct talker indication at the receiver.
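The field layout described in this section can be decoded with a short Python sketch such as the one below. The RtpHeader type and the parse_rtp_header name are hypothetical. The sketch assumes the default layout of one marker bit and a 7-bit payload type, and it does not interpret padding or a header extension.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-octet fixed header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-octet fixed header")
    # Network byte order: two octets, 16-bit sequence number, two 32-bit words.
    b0, b1, seq, ts, ssrc = struct.unpack_from("!BBHII", packet, 0)
    cc = b0 & 0x0F
    if len(packet) < 12 + 4 * cc:
        raise ValueError("packet too short for its CSRC count")
    csrc = struct.unpack_from(f"!{cc}I", packet, 12) if cc else ()
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=tuple(csrc),
    )
```

A caller would typically reject packets whose version field is not 2 before using the remaining fields.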
5.2 Multiplexing RTP Sessions
For efficient protocol processing, the number of multiplexing points
should be minimized, as described in the integrated layer processing
design principle [1]. In RTP, multiplexing is provided by the
destination transport address (network address and port number), which
defines an RTP session. For example, in a teleconference composed of
audio and video media encoded separately, each medium should be
carried in a separate RTP session with its own destination transport
address. It is not intended that the audio and video be carried in a
single RTP session and demultiplexed based on the payload type or
SSRC fields. Interleaving packets with different payload types but
using the same SSRC would introduce several problems:
1. If one payload type were switched during a session, there
would be no general means to identify which of the old
values the new one replaced.
2. An SSRC is defined to identify a single timing and sequence
number space. Interleaving multiple payload types would
require different timing spaces if the media clock rates
differ and would require different sequence number spaces
to tell which payload type suffered packet loss.
3. The RTCP sender and receiver reports (see Section 6.3) can
only describe one timing and sequence number space per SSRC
and do not carry a payload type field.
4. An RTP mixer would not be able to combine interleaved
streams of incompatible media into one stream.
5. Carrying multiple media in one RTP session precludes: the
use of different network paths or network resource
allocations if appropriate; reception of a subset of the
media if desired, for example just audio if video would
exceed the available bandwidth; and receiver
implementations that use separate processes for the
different media, whereas using separate RTP sessions
permits either single- or multiple-process implementations.
Using a different SSRC for each medium but sending them in the same
RTP session would avoid the first three problems but not the last
two.
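The arrangement recommended in this section, one RTP session per destination transport address with sources distinguished by SSRC inside each session, might be modelled as in the Python sketch below. The SessionDemux class, its method names, and the addresses in the usage note are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


class SessionDemux:
    """Group received packets by session, then by SSRC within the session."""

    def __init__(self):
        # (destination address, destination port) -> ssrc -> list of packets
        self._sessions = defaultdict(lambda: defaultdict(list))

    def deliver(self, dest_addr: str, dest_port: int, ssrc: int, packet: bytes) -> None:
        self._sessions[(dest_addr, dest_port)][ssrc].append(packet)

    def session_count(self) -> int:
        return len(self._sessions)

    def sources(self, dest_addr: str, dest_port: int):
        """Return the sorted SSRC identifiers seen in one session."""
        return sorted(self._sessions.get((dest_addr, dest_port), {}))
```

In a teleconference, audio sent to port 5004 and video sent to port 5006 of the same group address would then appear as two separate sessions, even if a participant happened to use the same SSRC in both.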
5.3 Profile-Specific Modifications to the RTP Header
The existing RTP data packet header is believed to be complete for
the set of functions required in common across all the application
classes that RTP might support. However, in keeping with the ALF
design principle, the header may be tailored through modifications or
additions defined in a profile specification while still allowing
profile-independent monitoring and recording tools to function.
o The marker bit and payload type field carry profile-specific
information, but they are allocated in the fixed header since
many applications are expected to need them and might otherwise
have to add another 32-bit word just to hold them. The octet
containing these fields may be redefined by a profile to suit
different requirements, for example with more or fewer marker
bits. If there are any marker bits, one should be located in
the most significant bit of the octet since profile-independent
monitors may be able to observe a correlation between packet
loss patterns and the marker bit.
o Additional information that is required for a particular
payload format, such as a video encoding, should be carried in
the payload section of the packet. This might be in a header
that is always present at the start of the payload section, or
might be indicated by a reserved value in the data pattern.
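As an illustration of a profile redefining the octet that holds the marker and payload type fields, the Python sketch below splits that octet for a configurable number of marker bits. The function name and the marker_bits parameter are hypothetical; the sketch assumes that the marker bits occupy the most significant bits of the octet, in line with the recommendation above.

```python
def split_marker_octet(octet: int, marker_bits: int = 1):
    """Split the second header octet into (marker value, payload type)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet out of range")
    if not 0 <= marker_bits <= 8:
        raise ValueError("marker_bits must be between 0 and 8")
    pt_bits = 8 - marker_bits
    marker = octet >> pt_bits                # most significant bits
    payload_type = octet & ((1 << pt_bits) - 1)
    return marker, payload_type
```

With the default of one marker bit, the octet 0xE0 decodes as marker 1 and payload type 96; a hypothetical profile using two marker bits would read the same octet as marker value 3 and payload type 32.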