   unless specified by an out-of-band means (e.g., SDP parameter or MIME
   parameter as defined in section 5).

   Other header fields are used as described in RFC 1889 [8].

3.2 Fragmentation of MPEG-4 Visual bitstream

   A fragmented MPEG-4 Visual bitstream is mapped directly onto the RTP
   payload without any addition of extra header fields or any removal of
   Visual syntax elements.  The Combined Configuration/Elementary
   streams mode is used.  The following rules apply for the
   fragmentation.

   In the following, header means one of the following:

   -  Configuration information (Visual Object Sequence Header, Visual
      Object Header and Video Object Layer Header)
   -  visual_object_sequence_end_code
   -  The header of the entry point function for an elementary stream
      (Group_of_VideoObjectPlane() or the header of VideoObjectPlane(),
      video_plane_with_short_header(), MeshObject() or FaceObject())
   -  The video packet header (video_packet_header() excluding
      next_resync_marker())
   -  The header of gob_layer()

   See 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][9][4] for the
   definition of the configuration information and the entry point
   functions.
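
   As a rough illustration of how a packetizer might locate the
   byte-aligned headers listed above, the following sketch scans a
   buffer for 0x000001xx start codes.  The start-code values are quoted
   from our reading of ISO/IEC 14496-2 and should be checked against
   that specification; the function names are hypothetical.  Resync
   markers and short-header start codes use different bit patterns and
   are not covered here.

```python
START_CODE_PREFIX = b"\x00\x00\x01"

# Start-code values as we understand them from ISO/IEC 14496-2 (assumption).
NAMED_CODES = {
    0xB0: "visual_object_sequence_start_code",
    0xB1: "visual_object_sequence_end_code",
    0xB2: "user_data_start_code",
    0xB3: "group_of_vop_start_code",
    0xB5: "visual_object_start_code",
    0xB6: "vop_start_code",
}


def classify_start_code(value: int) -> str:
    """Label the byte that follows the 0x000001 prefix."""
    if 0x00 <= value <= 0x1F:
        return "video_object_start_code"
    if 0x20 <= value <= 0x2F:
        return "video_object_layer_start_code"
    return NAMED_CODES.get(value, "other")


def find_start_codes(data: bytes):
    """Return a list of (offset, label) for each start code found in data."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    # Stop when no prefix is left or the prefix has no code byte after it.
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, classify_start_code(data[pos + 3])))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found
```

   A packetizer could use the returned offsets as candidate split
   points so that no header is divided between two RTP packets.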

   (1) Configuration information and Group_of_VideoObjectPlane() fields
   SHALL be placed at the beginning of the RTP payload (just after the
   RTP header) or just after the header of the syntactically upper layer
   function.

   (2) If one or more headers exist in the RTP payload, the RTP payload
   SHALL begin with the header of the syntactically highest function.
   Note: The visual_object_sequence_end_code is regarded as the lowest
   function.




Kikuchi, et al.             Standards Track                     [Page 6]

RFC 3016       RTP Payload Format for MPEG-4 Audio/Visual  November 2000


   (3) A header SHALL NOT be split into a plurality of RTP packets.

   (4) Different VOPs SHOULD be fragmented into different RTP packets so
   that one RTP packet consists of the data bytes associated with a
   unique VOP time instance (that is indicated in the timestamp field in
   the RTP packet header), with the exception that multiple consecutive
   VOPs MAY be carried within one RTP packet in the decoding order if
   the size of the VOPs is small.

   Note: When multiple VOPs are carried in one RTP payload, the
   timestamp of the VOPs after the first one may be calculated by the
   decoder.  This operation is necessary only for RTP packets in which
   the marker bit equals one and the beginning of the RTP payload
   corresponds to a start code. (See timestamp and marker bit in section
   3.1.)

   (5) It is RECOMMENDED that a single video packet is sent as a single
   RTP packet.  The size of a video packet SHOULD be adjusted in such a
   way that the resulting RTP packet is not larger than the path-MTU.
   Note: Rule (5) does not apply when the video packet is disabled by
   the coder configuration (by setting resync_marker_disable in the VOL
   header to 1), or in coding tools where the video packet is not
   supported.  In this case, a VOP MAY be split at arbitrary byte-
   positions.

   The video packet starts with the VOP header or the video packet
   header, followed by motion_shape_texture(), and ends with
   next_resync_marker() or next_start_code().

3.3 Examples of packetized MPEG-4 Visual bitstream

   Figure 2 shows examples of RTP packets generated based on the
   criteria described in 3.2.

   (a) is an example of the first RTP packet or the random access point
   of an MPEG-4 Visual bitstream containing the configuration
   information.  According to criterion (1), the Visual Object Sequence
   Header (VS header) is placed at the beginning of the RTP payload,
   preceding the Visual Object Header and the Video Object Layer
   Header (VO header, VOL header).  Since the fragmentation rule defined
   in 3.2 guarantees that the configuration information, starting with
   visual_object_sequence_start_code, is always placed at the beginning
   of the RTP payload, RTP receivers can detect the random access point
   by checking if the first 32-bit field of the RTP payload is
   visual_object_sequence_start_code.
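
   The check described in (a) might be sketched on the receiver side as
   follows.  The constant 0x000001B0 is the value of
   visual_object_sequence_start_code as we understand it from ISO/IEC
   14496-2; the function name is hypothetical, and the input is assumed
   to be the RTP payload with the RTP header already removed.

```python
import struct

# Assumed value of visual_object_sequence_start_code (ISO/IEC 14496-2).
VISUAL_OBJECT_SEQUENCE_START_CODE = 0x000001B0


def is_random_access_point(rtp_payload: bytes) -> bool:
    """Return True if the payload begins with the VS start code."""
    if len(rtp_payload) < 4:
        return False
    # Read the first 32-bit field in network (big-endian) byte order.
    (first_word,) = struct.unpack("!I", rtp_payload[:4])
    return first_word == VISUAL_OBJECT_SEQUENCE_START_CODE
```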

   (b) is another example of the RTP packet containing the configuration
   information.  It differs from example (a) in that the RTP packet also
   contains a video packet in the VOP following the configuration
   information.  Since the length of the configuration information is
   relatively short (typically scores of bytes) and an RTP packet
   containing only the configuration information may thus increase the
   overhead, the configuration information and the immediately following
   GOV and/or (a part of) VOP can be packetized into a single RTP packet
   as in this example.

   (c) is an example of an RTP packet that contains
   Group_of_VideoObjectPlane (GOV).  Following criterion (1), the GOV is
   placed at the beginning of the RTP payload.  It would be a waste of
   RTP/IP header overhead to generate an RTP packet containing only a
   GOV whose length is 7 bytes.  Therefore, (a part of) the following
   VOP can be placed in the same RTP packet as shown in (c).
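
   To put a number on the overhead argument in (c), the sketch below
   computes the fraction of a packet taken up by headers.  It assumes a
   12-byte RTP header without CSRC entries carried over UDP (8 bytes)
   and IPv4 without options (20 bytes); these sizes and the function
   name are illustrative assumptions, not part of this specification.

```python
def header_overhead_ratio(payload_len, rtp_header=12, udp_header=8,
                          ip_header=20):
    """Return the share of the packet occupied by RTP/UDP/IP headers."""
    headers = rtp_header + udp_header + ip_header
    return headers / (headers + payload_len)


# A packet carrying only a 7-byte GOV: 40 of its 47 bytes are headers.
gov_only = header_overhead_ratio(7)

# The same GOV followed by 993 bytes of VOP data in one packet.
gov_plus_vop = header_overhead_ratio(7 + 993)
```

   Under these assumptions a GOV-only packet is about 85% header, which
   is why appending (a part of) the following VOP is preferable.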

   (d) is an example of the case where one video packet is packetized
   into one RTP packet.  When the packet-loss rate of the underlying
   network is high, this kind of packetization is recommended.  Even
   when the RTP packet containing the VOP header is discarded by a
   packet loss, the other RTP packets can be decoded by using the
   HEC (Header Extension Code) information in the video packet header.
   No extra RTP header field is necessary.

   (e) is an example of the case where more than one video packet is
   packetized into one RTP packet.  This kind of packetization is
   effective in reducing RTP/IP header overhead when the bit-rate of
   the underlying network is low.  However, it will decrease the
   packet-loss resiliency because multiple video packets are discarded
   by a single RTP packet loss.  The optimal number of video packets in
   an RTP packet and the length of the RTP packet can be determined
   considering the packet-loss rate and the bit-rate of the underlying
   network.
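
   One possible way to realize the packetization in (e) is a greedy
   grouping of whole video packets, sketched below.  It assumes that
   all video packets in the input belong to the same VOP and that each
   begins with its own VOP header or video packet header, so every
   resulting payload starts with a header.  The function name and the
   max_payload parameter are hypothetical; a video packet larger than
   max_payload is emitted alone rather than split.

```python
def aggregate_video_packets(video_packets, max_payload):
    """Group whole video packets into payloads of at most max_payload bytes."""
    payloads = []
    current = b""
    for vp in video_packets:
        # Start a new payload when adding this video packet would overflow.
        if current and len(current) + len(vp) > max_payload:
            payloads.append(current)
            current = b""
        current += vp
    if current:
        payloads.append(current)
    return payloads
```

   A smaller max_payload yields fewer video packets per RTP packet and
   therefore less data lost per dropped packet, at the cost of more
   header overhead.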

   (f) is an example of the case when the video packet is disabled by
   setting resync_marker_disable in the VOL header to 1.  In this case,
   a VOP may be split into a plurality of RTP packets at arbitrary
   byte-positions.  For example, it is possible to split a VOP into
   fixed-length packets.  This kind of coder configuration and RTP
   packet fragmentation may be used when the underlying network is
   guaranteed to be error-free.  It is not recommended in an
   error-prone environment, however, since it provides only poor
   packet-loss resiliency.
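
   The fixed-length splitting mentioned in (f) could look like the
   following sketch.  The header_len parameter is a hypothetical input
   giving the number of leading bytes that contain the VOP header; the
   first fragment is extended to cover it so that the header is not
   split across RTP packets, as criterion (3) requires.

```python
def split_vop_fixed(vop: bytes, fragment_len: int, header_len: int = 0):
    """Split one VOP into fragments of fragment_len bytes each."""
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    if not vop:
        return []
    # Keep the whole header inside the first fragment.
    first_len = max(fragment_len, header_len)
    fragments = [vop[:first_len]]
    for start in range(first_len, len(vop), fragment_len):
        fragments.append(vop[start:start + fragment_len])
    return fragments
```

   Only the first fragment begins with a header here; the remaining
   fragments carry raw VOP data, as in Figure 2(f).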

   Figure 3 shows examples of RTP packets prohibited by the criteria of
   3.2.

   Fragmentation of a header into multiple RTP packets, as in (a), will
   not only increase the overhead of RTP/IP headers but also decrease
   the error resiliency.  Therefore, it is prohibited by criterion (3).

   When concatenating more than one video packet into an RTP packet, a
   VOP header or video_packet_header() shall not be placed in the middle
   of the RTP payload.  The packetization in (b) is not allowed by
   criterion (2) for reasons of error resiliency.  Comparing
   this example with Figure 2(d), although two video packets are mapped
   onto two RTP packets in both cases, the packet-loss resiliency is not
   identical.  Namely, if the second RTP packet is lost, both video
   packets 1 and 2 are lost in the case of Figure 3(b) whereas only
   video packet 2 is lost in the case of Figure 2(d).
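
   The comparison above can be restated as a small model.  Each RTP
   packet is represented by the list of video packets it carries in
   whole or in part, and a video packet is treated as lost if any RTP
   packet carrying part of it is lost.  The layouts and the function
   name are illustrative only.

```python
def lost_video_packets(layout, lost_rtp_index):
    """Return the video packets affected by losing one RTP packet."""
    return sorted(set(layout[lost_rtp_index]))


# Figure 2(d): one video packet per RTP packet.
FIGURE_2D = [[1], [2]]

# Figure 3(b): the second RTP packet carries the last half of video
# packet 1 followed by video packet 2.
FIGURE_3B = [[1], [1, 2]]

lost_2d = lost_video_packets(FIGURE_2D, 1)
lost_3b = lost_video_packets(FIGURE_3B, 1)
```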

    +------+------+------+------+
(a) | RTP  |  VS  |  VO  | VOL  |
    |header|header|header|header|
    +------+------+------+------+

    +------+------+------+------+------------+
(b) | RTP  |  VS  |  VO  | VOL  |Video Packet|
    |header|header|header|header|            |
    +------+------+------+------+------------+

    +------+-----+------------------+
(c) | RTP  | GOV |Video Object Plane|
    |header|     |                  |
    +------+-----+------------------+

    +------+------+------------+  +------+------+------------+
(d) | RTP  | VOP  |Video Packet|  | RTP  |  VP  |Video Packet|
    |header|header|    (1)     |  |header|header|    (2)     |
    +------+------+------------+  +------+------+------------+

    +------+------+------------+------+------------+------+------------+
(e) | RTP  |  VP  |Video Packet|  VP  |Video Packet|  VP  |Video Packet|
    |header|header|     (1)    |header|    (2)     |header|    (3)     |
    +------+------+------------+------+------------+------+------------+

    +------+------+------------+  +------+------------+
(f) | RTP  | VOP  |VOP fragment|  | RTP  |VOP fragment|
    |header|header|    (1)     |  |header|    (2)     | ___
    +------+------+------------+  +------+------------+

     Figure 2 - Examples of RTP packetized MPEG-4 Visual bitstream

    +------+-------------+  +------+------------+------------+
(a) | RTP  |First half of|  | RTP  |Last half of|Video Packet|
    |header|  VP header  |  |header|  VP header |            |
    +------+-------------+  +------+------------+------------+

    +------+------+----------+  +------+---------+------+------------+
(b) | RTP  | VOP  |First half|  | RTP  |Last half|  VP  |Video Packet|
    |header|header| of VP(1) |  |header| of VP(1)|header|    (2)     |
    +------+------+----------+  +------+---------+------+------------+

   Figure 3 - Examples of prohibited RTP packetization for MPEG-4 Visual
   bitstream

4. RTP Packetization of MPEG-4 Audio bitstream

   This section specifies RTP packetization rules for MPEG-4 Audio
   bitstreams.  MPEG-4 Audio streams MUST be formatted by the LATM
   (Low-overhead MPEG-4 Audio Transport Multiplex) tool [5], and the
   LATM-based streams are then mapped onto RTP packets as described in
   the three sections below.

4.1 RTP Packet Format

   LATM-based streams consist of a sequence of audioMuxElements that
   include one or more audio frames.  A complete audioMuxElement or a
   part of one SHALL be mapped directly onto an RTP payload without any
   removal of audioMuxElement syntax elements (see Figure 4).  The first
   byte of each audioMuxElement SHALL be located at the first payload
   location in an RTP packet.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |RTP
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |RTP
:                 audioMuxElement (byte aligned)                :Payload
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

             Figure 4 - An RTP packet for MPEG-4 Audio
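
   A minimal sketch of assembling the packet in Figure 4 follows.  It
   builds a 12-byte RTP header with no padding, no extension, and no
   CSRC entries, and appends the audioMuxElement bytes unchanged so
   that the first byte of the audioMuxElement is the first payload
   byte.  The function name and the default payload type 96 (a dynamic
   value) are assumptions made for illustration.

```python
import struct


def build_rtp_packet(audio_mux_element: bytes, seq: int, timestamp: int,
                     ssrc: int, payload_type: int = 96,
                     marker: bool = False) -> bytes:
    """Prepend a fixed RTP header to an audioMuxElement (or a part of one)."""
    first_byte = 2 << 6  # V=2, P=0, X=0, CC=0
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    # The payload follows the header with no extra fields in between.
    return header + audio_mux_element
```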

   In order to decode the audioMuxElement, the following
   muxConfigPresent information is required to be indicated by an out-
   of-band means.  When SDP is utilized for this indication, MIME
   parameter "cpresent" corresponds to the muxConfigPresent information