RFC 1889                          RTP                       January 1996


   All header data is aligned to its natural length, i.e., 16-bit fields
   are aligned on even offsets, 32-bit fields are aligned at offsets
   divisible by four, etc. Octets designated as padding have the value
   zero.

   Wallclock time (absolute time) is represented using the timestamp
   format of the Network Time Protocol (NTP), which is in seconds
   relative to 0h UTC on 1 January 1900 [5]. The full resolution NTP
   timestamp is a 64-bit unsigned fixed-point number with the integer
   part in the first 32 bits and the fractional part in the last 32
   bits. In some fields where a more compact representation is
   appropriate, only the middle 32 bits are used; that is, the low 16
   bits of the integer part and the high 16 bits of the fractional part.
   The high 16 bits of the integer part must be determined
   independently.
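
   As a rough illustration of the two representations above, the
   following Python sketch builds a 64-bit NTP timestamp from a Unix
   time and extracts the compact middle 32 bits. The function names are
   hypothetical, and the input is assumed to be a non-negative Unix time
   in seconds; none of this is mandated by the specification.

```python
# Seconds between the NTP epoch (0h UTC, 1 January 1900) and the
# Unix epoch (0h UTC, 1 January 1970): 70 years including 17 leap days.
NTP_UNIX_OFFSET = 2208988800


def unix_to_ntp64(unix_seconds: float) -> int:
    """Return a 64-bit NTP timestamp: 32-bit seconds, 32-bit fraction.

    Assumes unix_seconds is non-negative (illustrative helper only).
    """
    whole = int(unix_seconds)
    frac = int((unix_seconds - whole) * (1 << 32))
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    return (seconds << 32) | (frac & 0xFFFFFFFF)


def ntp64_to_compact32(ntp64: int) -> int:
    """Keep the middle 32 bits: the low 16 bits of the integer part
    and the high 16 bits of the fractional part."""
    return (ntp64 >> 16) & 0xFFFFFFFF


ts = unix_to_ntp64(0.5)           # half a second after the Unix epoch
compact = ntp64_to_compact32(ts)
```

   The reverse mapping is not shown because the compact form alone does
   not carry the high 16 bits of the integer part.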

5.  RTP Data Transfer Protocol

5.1 RTP Fixed Header Fields

   The RTP header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           timestamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           synchronization source (SSRC) identifier            |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |            contributing source (CSRC) identifiers             |
   |                             ....                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The first twelve octets are present in every RTP packet, while the
   list of CSRC identifiers is present only when inserted by a mixer.
   The fields have the following meaning:

   version (V): 2 bits
        This field identifies the version of RTP. The version defined by
        this specification is two (2). (The value 1 is used by the first
        draft version of RTP and the value 0 is used by the protocol
        initially implemented in the "vat" audio tool.)

   padding (P): 1 bit
        If the padding bit is set, the packet contains one or more
        additional padding octets at the end which are not part of the
        payload. The last octet of the padding contains a count of how
        many padding octets should be ignored. Padding may be needed by
        some encryption algorithms with fixed block sizes or for
        carrying several RTP packets in a lower-layer protocol data
        unit.

Schulzrinne, et al          Standards Track                    [Page 10]

   extension (X): 1 bit
        If the extension bit is set, the fixed header is followed by
        exactly one header extension, with a format defined in Section
        5.3.1.

   CSRC count (CC): 4 bits
        The CSRC count contains the number of CSRC identifiers that
        follow the fixed header.

   marker (M): 1 bit
        The interpretation of the marker is defined by a profile. It is
        intended to allow significant events such as frame boundaries to
        be marked in the packet stream. A profile may define additional
        marker bits or specify that there is no marker bit by changing
        the number of bits in the payload type field (see Section 5.3).

   payload type (PT): 7 bits
        This field identifies the format of the RTP payload and
        determines its interpretation by the application. A profile
        specifies a default static mapping of payload type codes to
        payload formats. Additional payload type codes may be defined
        dynamically through non-RTP means (see Section 3). An initial
        set of default mappings for audio and video is specified in the
        companion profile Internet-Draft draft-ietf-avt-profile, and
        may be extended in future editions of the Assigned Numbers RFC
        [6].  An RTP sender emits a single RTP payload type at any given
        time; this field is not intended for multiplexing separate media
        streams (see Section 5.2).

   sequence number: 16 bits
        The sequence number increments by one for each RTP data packet
        sent, and may be used by the receiver to detect packet loss and
        to restore packet sequence. The initial value of the sequence
        number is random (unpredictable) to make known-plaintext attacks
        on encryption more difficult, even if the source itself does not
        encrypt, because the packets may flow through a translator that
        does. Techniques for choosing unpredictable numbers are
        discussed in [7].
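
        The behavior of the 16-bit sequence number can be sketched in
        Python as follows. The function names are hypothetical, and the
        standard `secrets` module is assumed to be an adequate source of
        unpredictable values; the loss estimate assumes in-order arrival
        and does not handle reordering or duplicates.

```python
import secrets

SEQ_MOD = 1 << 16  # the sequence number is a 16-bit field


def initial_sequence_number() -> int:
    """Pick an unpredictable 16-bit starting value."""
    return secrets.randbelow(SEQ_MOD)


def next_sequence_number(seq: int) -> int:
    """Advance by one, wrapping from 65535 back to 0."""
    return (seq + 1) % SEQ_MOD


def packets_missing(prev_seq: int, new_seq: int) -> int:
    """Number of packets skipped between two in-order arrivals.

    Assumes new_seq was sent after prev_seq and fewer than 65536
    packets apart; modular arithmetic absorbs the wraparound.
    """
    return (new_seq - prev_seq - 1) % SEQ_MOD
```

        For instance, receiving 65534 followed by 1 means that packets
        65535 and 0 did not arrive, so the function reports two missing
        packets.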

   timestamp: 32 bits
        The timestamp reflects the sampling instant of the first octet
        in the RTP data packet. The sampling instant must be derived
        from a clock that increments monotonically and linearly in time
        to allow synchronization and jitter calculations (see Section
        6.3.1).  The resolution of the clock must be sufficient for the
        desired synchronization accuracy and for measuring packet
        arrival jitter (one tick per video frame is typically not
        sufficient).  The clock frequency is dependent on the format of
        data carried as payload and is specified statically in the
        profile or payload format specification that defines the format,
        or may be specified dynamically for payload formats defined
        through non-RTP means. If RTP packets are generated
        periodically, the nominal sampling instant as determined from
        the sampling clock is to be used, not a reading of the system
        clock. As an example, for fixed-rate audio the timestamp clock
        would likely increment by one for each sampling period.  If an
        audio application reads blocks covering 160 sampling periods
        from the input device, the timestamp would be increased by 160
        for each such block, regardless of whether the block is
        transmitted in a packet or dropped as silent.

        The initial value of the timestamp is random, as for the
        sequence number. Several consecutive RTP packets may have equal
        timestamps if they are (logically) generated at once, e.g.,
        belong to the same video frame. Consecutive RTP packets may
        contain timestamps that are not monotonic if the data is not
        transmitted in the order it was sampled, as in the case of MPEG
        interpolated video frames. (The sequence numbers of the packets
        as transmitted will still be monotonic.)
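
        The fixed-rate audio example can be sketched in Python as
        follows. The generator name and its arguments are hypothetical.
        The starting values are fixed here only to make the result
        reproducible; a real sender would choose them randomly as
        described above.

```python
SAMPLES_PER_BLOCK = 160   # block size from the audio example
SEQ_MOD = 1 << 16         # 16-bit sequence number
TS_MOD = 1 << 32          # 32-bit timestamp


def packetize(blocks, first_seq, first_ts):
    """Yield (sequence number, timestamp) for each transmitted block.

    `blocks` is an iterable of booleans: True if the block is sent,
    False if it is dropped as silent.
    """
    seq, ts = first_seq, first_ts
    for is_sent in blocks:
        if is_sent:
            yield seq, ts
            seq = (seq + 1) % SEQ_MOD            # only for packets sent
        ts = (ts + SAMPLES_PER_BLOCK) % TS_MOD   # for every block read


# One talk block, two silent blocks, then another talk block.
sent = list(packetize([True, False, False, True],
                      first_seq=100, first_ts=1000))
```

        Here `sent` is `[(100, 1000), (101, 1480)]`: the sequence
        number advances by one between the two packets, while the
        timestamp advances by 3 * 160 because it also covers the two
        silent blocks.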

   SSRC: 32 bits
        The SSRC field identifies the synchronization source. This
        identifier is chosen randomly, with the intent that no two
        synchronization sources within the same RTP session will have
        the same SSRC identifier. An example algorithm for generating a
        random identifier is presented in Appendix A.6. Although the
        probability of multiple sources choosing the same identifier is
        low, all RTP implementations must be prepared to detect and
        resolve collisions.  Section 8 describes the probability of
        collision along with a mechanism for resolving collisions and
        detecting RTP-level forwarding loops based on the uniqueness of
        the SSRC identifier. If a source changes its source transport
        address, it must also choose a new SSRC identifier to avoid
        being interpreted as a looped source.
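
        A minimal Python sketch of choosing an SSRC and reacting to a
        collision is shown below. It is not the Appendix A.6 algorithm:
        it assumes that the standard `secrets` module is an acceptable
        source of randomness, and the function names and the `in_use`
        set of identifiers already observed in the session are
        hypothetical. The full collision and loop handling of Section 8
        is not reproduced.

```python
import secrets


def choose_ssrc(in_use) -> int:
    """Draw a random 32-bit identifier not already seen in the session."""
    while True:
        candidate = secrets.randbits(32)
        if candidate not in in_use:
            return candidate


def resolve_collision(own_ssrc: int, received_ssrc: int, in_use) -> int:
    """Return the SSRC to use after seeing a packet from another source.

    If the other source carries our identifier, pick a fresh one;
    otherwise keep the current identifier.
    """
    if received_ssrc == own_ssrc:
        return choose_ssrc(set(in_use) | {own_ssrc})
    return own_ssrc
```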

   CSRC list: 0 to 15 items, 32 bits each
        The CSRC list identifies the contributing sources for the
        payload contained in this packet. The number of identifiers is
        given by the CC field. If there are more than 15 contributing
        sources, only 15 may be identified. CSRC identifiers are
        inserted by mixers, using the SSRC identifiers of contributing
        sources. For example, for audio packets the SSRC identifiers of
        all sources that were mixed together to create a packet are
        listed, allowing correct talker indication at the receiver.
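
   The field layout described in this section can be illustrated with a
   short Python parser. This is a sketch under stated assumptions, not a
   reference implementation: the names `RtpHeader`, `parse_rtp_header`
   and `payload_without_padding` are hypothetical, the default marker
   and payload type split (1 bit and 7 bits) is assumed, a header
   extension (X = 1) is not skipped, and the padding count is assumed to
   include the count octet itself, which the later revision RFC 3550
   states explicitly.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> Tuple[RtpHeader, int]:
    """Return the header and the offset just past the CSRC list."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-octet fixed header")
    # Network byte order: two octets, 16-bit seq, 32-bit ts, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count, 0 to 15
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("truncated CSRC list")
    csrc = struct.unpack("!%dI" % cc, packet[12:end])
    header = RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc=csrc,
    )
    return header, end


def payload_without_padding(header: RtpHeader, packet: bytes,
                            start: int) -> bytes:
    """Drop trailing padding; assumes the count octet counts itself
    and that no header extension precedes the payload."""
    if not header.padding:
        return packet[start:]
    count = packet[-1]
    if count == 0 or count > len(packet) - start:
        raise ValueError("invalid padding count")
    return packet[start:len(packet) - count]


# V=2, P=1, X=0, CC=1 -> 0xA1; M=1, PT=5 -> 0x85 (values are arbitrary).
example = (bytes([0xA1, 0x85])
           + struct.pack("!HII", 0x1234, 0x0A00, 0xDEADBEEF)
           + struct.pack("!I", 0x11223344)
           + b"\x01\x02"        # two payload octets
           + b"\x00\x02")       # two padding octets, the last is the count
hdr, offset = parse_rtp_header(example)
payload = payload_without_padding(hdr, example, offset)
```

   With the example packet, the parser reports version 2, one CSRC
   identifier, and an offset of 16 octets; the two trailing padding
   octets are removed from the payload.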

5.2 Multiplexing RTP Sessions

   For efficient protocol processing, the number of multiplexing points
   should be minimized, as described in the integrated layer processing
   design principle [1]. In RTP, multiplexing is provided by the
   destination transport address (network address and port number),
   which defines an RTP session. For example, in a teleconference
   composed of audio and video media encoded separately, each medium
   should be carried in a separate RTP session with its own destination
   transport address. It is not intended that the audio and video be
   carried in a single RTP session and demultiplexed based on the
   payload type or SSRC fields. Interleaving packets with different
   payload types but using the same SSRC would introduce several
   problems:

        1.   If one payload type were switched during a session, there
             would be no general means to identify which of the old
             values the new one replaced.

        2.   An SSRC is defined to identify a single timing and sequence
             number space. Interleaving multiple payload types would
             require different timing spaces if the media clock rates
             differ and would require different sequence number spaces
             to tell which payload type suffered packet loss.

        3.   The RTCP sender and receiver reports (see Section 6.3) can
             only describe one timing and sequence number space per SSRC
             and do not carry a payload type field.

        4.   An RTP mixer would not be able to combine interleaved
             streams of incompatible media into one stream.

        5.   Carrying multiple media in one RTP session precludes: the
             use of different network paths or network resource
             allocations if appropriate; reception of a subset of the
             media if desired, for example just audio if video would
             exceed the available bandwidth; and receiver
             implementations that use separate processes for the
             different media, whereas using separate RTP sessions
             permits either single- or multiple-process implementations.

   Using a different SSRC for each medium but sending them in the same
   RTP session would avoid the first three problems but not the last
   two.
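
   The arrangement recommended in this section can be sketched in Python
   as receiver state keyed first by the destination transport address
   (the session) and then by SSRC, so that each SSRC keeps a single
   sequence number space per session. The multicast address, the port
   numbers and the names `last_seq` and `on_packet` are hypothetical
   and chosen only for illustration.

```python
from collections import defaultdict

# session (destination address, port) -> SSRC -> last sequence number
last_seq = defaultdict(dict)

AUDIO_SESSION = ("224.2.0.1", 5004)   # hypothetical audio session
VIDEO_SESSION = ("224.2.0.1", 5006)   # hypothetical video session


def on_packet(destination, ssrc, seq):
    """Record a packet under the session selected by its destination.

    The payload type plays no part in choosing the session.
    """
    last_seq[destination][ssrc] = seq


# The same source identifier in two sessions is tracked independently.
on_packet(AUDIO_SESSION, 0x1111, 40)
on_packet(VIDEO_SESSION, 0x1111, 9000)
on_packet(AUDIO_SESSION, 0x1111, 41)
```

   A receiver that only wants audio would simply not listen on the video
   session's transport address.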

5.3 Profile-Specific Modifications to the RTP Header

   The existing RTP data packet header is believed to be complete for
   the set of functions required in common across all the application
   classes that RTP might support. However, in keeping with the ALF
   design principle, the header may be tailored through modifications or
   additions defined in a profile specification while still allowing
   profile-independent monitoring and recording tools to function.

        o The marker bit and payload type field carry profile-specific
         information, but they are allocated in the fixed header since
         many applications are expected to need them and might otherwise
         have to add another 32-bit word just to hold them. The octet
         containing these fields may be redefined by a profile to suit
         different requirements, for example with more or fewer marker
         bits. If there are any marker bits, one should be located in
         the most significant bit of the octet since profile-independent
         monitors may be able to observe a correlation between packet
         loss patterns and the marker bit.

        o Additional information that is required for a particular
         payload format, such as a video encoding, should be carried in
         the payload section of the packet. This might be in a header
         that is always present at the start of the payload section, or
         might be indicated by a reserved value in the data pattern.