📄 rfc3016.txt
   The resolution of the timestamp is set to its default value of 90kHz,
   unless specified by an out-of-band means (e.g., SDP parameter or MIME
   parameter as defined in section 5).

   Other header fields are used as described in RFC 1889 [8].

3.2 Fragmentation of MPEG-4 Visual bitstream

   A fragmented MPEG-4 Visual bitstream is mapped directly onto the RTP
   payload without any addition of extra header fields or any removal of
   Visual syntax elements.  The Combined Configuration/Elementary
   streams mode is used.  The following rules apply for the
   fragmentation.

   In the following, header means one of the following:

   -  Configuration information (Visual Object Sequence Header, Visual
      Object Header and Video Object Layer Header)
   -  visual_object_sequence_end_code
   -  The header of the entry point function for an elementary stream
      (Group_of_VideoObjectPlane() or the header of VideoObjectPlane(),
      video_plane_with_short_header(), MeshObject() or FaceObject())
   -  The video packet header (video_packet_header() excluding
      next_resync_marker())
   -  The header of gob_layer()

   See 6.2.1 "Start codes" of ISO/IEC 14496-2 [2][9][4] for the
   definition of the configuration information and the entry point
   functions.

   (1) Configuration information and Group_of_VideoObjectPlane() fields
   SHALL be placed at the beginning of the RTP payload (just after the
   RTP header) or just after the header of the syntactically upper layer
   function.

   (2) If one or more headers exist in the RTP payload, the RTP payload
   SHALL begin with the header of the syntactically highest function.

   Note: The visual_object_sequence_end_code is regarded as the lowest
   function.

Kikuchi, et al.             Standards Track                     [Page 6]

RFC 3016      RTP Payload Format for MPEG-4 Audio/Visual   November 2000

   (3) A header SHALL NOT be split into a plurality of RTP packets.
   (4) Different VOPs SHOULD be fragmented into different RTP packets so
   that one RTP packet consists of the data bytes associated with a
   unique VOP time instance (that is indicated in the timestamp field in
   the RTP packet header), with the exception that multiple consecutive
   VOPs MAY be carried within one RTP packet in the decoding order if
   the size of the VOPs is small.

   Note: When multiple VOPs are carried in one RTP payload, the
   timestamp of the VOPs after the first one may be calculated by the
   decoder.  This operation is necessary only for RTP packets in which
   the marker bit equals one and the beginning of the RTP payload
   corresponds to a start code.  (See timestamp and marker bit in
   section 3.1.)

   (5) It is RECOMMENDED that a single video packet is sent as a single
   RTP packet.  The size of a video packet SHOULD be adjusted in such a
   way that the resulting RTP packet is not larger than the path-MTU.

   Note: Rule (5) does not apply when the video packet is disabled by
   the coder configuration (by setting resync_marker_disable in the VOL
   header to 1), or in coding tools where the video packet is not
   supported.  In this case, a VOP MAY be split at arbitrary
   byte-positions.

   The video packet starts with the VOP header or the video packet
   header, followed by motion_shape_texture(), and ends with
   next_resync_marker() or next_start_code().

3.3 Examples of packetized MPEG-4 Visual bitstream

   Figure 2 shows examples of RTP packets generated based on the
   criteria described in 3.2.

   (a) is an example of the first RTP packet or the random access point
   of an MPEG-4 Visual bitstream containing the configuration
   information.  According to criterion (1), the Visual Object Sequence
   Header (VS header) is placed at the beginning of the RTP payload,
   preceding the Visual Object Header and the Video Object Layer
   Header (VO header, VOL header).
   Since the fragmentation rule defined in 3.2 guarantees that the
   configuration information, starting with
   visual_object_sequence_start_code, is always placed at the beginning
   of the RTP payload, RTP receivers can detect the random access point
   by checking if the first 32-bit field of the RTP payload is
   visual_object_sequence_start_code.

   (b) is another example of the RTP packet containing the configuration
   information.  It differs from example (a) in that the RTP packet also
   contains a video packet in the VOP following the configuration
   information.  Since the length of the configuration information is
   relatively short (typically scores of bytes) and an RTP packet
   containing only the configuration information may thus increase the
   overhead, the configuration information and the immediately following
   GOV and/or (a part of) VOP can be packetized into a single RTP packet
   as in this example.

   (c) is an example of an RTP packet that contains
   Group_of_VideoObjectPlane (GOV).  Following criterion (1), the GOV is
   placed at the beginning of the RTP payload.  It would be a waste of
   RTP/IP header overhead to generate an RTP packet containing only a
   GOV whose length is 7 bytes.  Therefore, (a part of) the following
   VOP can be placed in the same RTP packet as shown in (c).

   (d) is an example of the case where one video packet is packetized
   into one RTP packet.  When the packet-loss rate of the underlying
   network is high, this kind of packetization is recommended.  Even
   when the RTP packet containing the VOP header is discarded by a
   packet loss, the other RTP packets can be decoded by using the HEC
   (Header Extension Code) information in the video packet header.  No
   extra RTP header field is necessary.

   (e) is an example of the case where more than one video packet is
   packetized into one RTP packet.
   This kind of packetization is effective in saving the overhead of
   RTP/IP headers when the bit-rate of the underlying network is low.
   However, it will decrease the packet-loss resiliency because multiple
   video packets are discarded by a single RTP packet loss.  The optimal
   number of video packets in an RTP packet and the length of the RTP
   packet can be determined considering the packet-loss rate and the
   bit-rate of the underlying network.

   (f) is an example of the case when the video packet is disabled by
   setting resync_marker_disable in the VOL header to 1.  In this case,
   a VOP may be split into a plurality of RTP packets at arbitrary
   byte-positions.  For example, it is possible to split a VOP into
   fixed-length packets.  This kind of coder configuration and RTP
   packet fragmentation may be used when the underlying network is
   guaranteed to be error-free.  On the other hand, it is not
   recommended to use it in an error-prone environment since it provides
   only poor packet-loss resiliency.

   Figure 3 shows examples of RTP packets prohibited by the criteria of
   3.2.

   Fragmentation of a header into multiple RTP packets, as in (a), will
   not only increase the overhead of RTP/IP headers but also decrease
   the error resiliency.  Therefore, it is prohibited by criterion (3).

   When concatenating more than one video packet into an RTP packet, the
   VOP header or video_packet_header() shall not be placed in the middle
   of the RTP payload.  The packetization as in (b) is not allowed by
   criterion (2) for reasons of error resiliency.  Comparing this
   example with Figure 2(d), although two video packets are mapped onto
   two RTP packets in both cases, the packet-loss resiliency is not
   identical.  Namely, if the second RTP packet is lost, both video
   packets 1 and 2 are lost in the case of Figure 3(b) whereas only
   video packet 2 is lost in the case of Figure 2(d).
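
   Two of the points made in the examples above can be sketched in a few
   lines of Python: the random-access check described for example (a)
   and the fixed-length split described for example (f).  This is only
   an illustrative sketch, not part of the payload format.  The
   start-code value 0x000001B0 is taken from ISO/IEC 14496-2 and is not
   stated in this section; the function names and the max_payload
   parameter are hypothetical names chosen for the example.

```python
import struct
from typing import List

# Assumption: value of visual_object_sequence_start_code as defined in
# ISO/IEC 14496-2; this section only refers to the code by name.
VISUAL_OBJECT_SEQUENCE_START_CODE = 0x000001B0


def is_random_access_point(payload: bytes) -> bool:
    """Check whether the first 32-bit field of an RTP payload is
    visual_object_sequence_start_code (hypothetical helper)."""
    if len(payload) < 4:
        return False
    # Start codes are read as big-endian (network order) 32-bit values.
    (first_word,) = struct.unpack(">I", payload[:4])
    return first_word == VISUAL_OBJECT_SEQUENCE_START_CODE


def split_vop_fixed_length(vop: bytes, max_payload: int) -> List[bytes]:
    """Split one VOP into fixed-length RTP payloads (hypothetical helper).

    Only meaningful when video packets are disabled
    (resync_marker_disable set to 1).  The sketch does not parse the VOP
    header, so the caller must pick max_payload large enough that the
    whole VOP header falls inside the first fragment; otherwise rule (3)
    would be violated.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [vop[i:i + max_payload] for i in range(0, len(vop), max_payload)]
```

   A sender would still have to keep every resulting RTP packet within
   the path-MTU, and a receiver would apply the check only to the start
   of each RTP payload, never to bytes in the middle of one.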

    +------+------+------+------+
(a) | RTP  |  VS  |  VO  | VOL  |
    |header|header|header|header|
    +------+------+------+------+

    +------+------+------+------+------------+
(b) | RTP  |  VS  |  VO  | VOL  |Video Packet|
    |header|header|header|header|            |
    +------+------+------+------+------------+

    +------+-----+------------------+
(c) | RTP  | GOV |Video Object Plane|
    |header|     |                  |
    +------+-----+------------------+

    +------+------+------------+  +------+------+------------+
(d) | RTP  | VOP  |Video Packet|  | RTP  |  VP  |Video Packet|
    |header|header|    (1)     |  |header|header|    (2)     |
    +------+------+------------+  +------+------+------------+

    +------+------+------------+------+------------+------+------------+
(e) | RTP  |  VP  |Video Packet|  VP  |Video Packet|  VP  |Video Packet|
    |header|header|    (1)     |header|    (2)     |header|    (3)     |
    +------+------+------------+------+------------+------+------------+

    +------+------+------------+  +------+------------+
(f) | RTP  | VOP  |VOP fragment|  | RTP  |VOP fragment|
    |header|header|    (1)     |  |header|    (2)     | ___
    +------+------+------------+  +------+------------+

     Figure 2 - Examples of RTP packetized MPEG-4 Visual bitstream

    +------+-------------+  +------+------------+------------+
(a) | RTP  |First half of|  | RTP  |Last half of|Video Packet|
    |header|  VP header  |  |header|  VP header |            |
    +------+-------------+  +------+------------+------------+

    +------+------+----------+  +------+---------+------+------------+
(b) | RTP  | VOP  |First half|  | RTP  |Last half|  VP  |Video Packet|
    |header|header| of VP(1) |  |header| of VP(1)|header|    (2)     |
    +------+------+----------+  +------+---------+------+------------+

     Figure 3 - Examples of prohibited RTP packetization for MPEG-4
                Visual bitstream

4. RTP Packetization of MPEG-4 Audio bitstream

   This section specifies RTP packetization rules for MPEG-4 Audio
   bitstreams.
   MPEG-4 Audio streams MUST be formatted by the LATM (Low-overhead
   MPEG-4 Audio Transport Multiplex) tool [5], and the LATM-based
   streams are then mapped onto RTP packets as described in the three
   sections below.

4.1 RTP Packet Format

   LATM-based streams consist of a sequence of audioMuxElements that
   include one or more audio frames.  A complete audioMuxElement or a
   part of one SHALL be mapped directly onto an RTP payload without any
   removal of audioMuxElement syntax elements (see Figure 4).  The first
   byte of each audioMuxElement SHALL be located at the first payload
   location in an RTP packet.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |RTP
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |Header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |RTP
:                 audioMuxElement (byte aligned)                :Payload
|                                                               |
|                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               :...OPTIONAL RTP padding        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 4 - An RTP packet for MPEG-4 Audio

   In order to decode the audioMuxElement, the following
   muxConfigPresent information is required to be indicated by an
   out-of-band means.  When SDP is utilized for this indication, MIME