Network Working Group                                        M. Civanlar
Request for Comments: 2343                                       G. Cash
Category: Experimental                                        B. Haskell
                                                      AT&T Labs-Research
                                                                May 1998
RTP Payload Format for Bundled MPEG
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. This memo does not specify an Internet standard of any
kind. Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1998). All Rights Reserved.
Abstract
This document describes a payload type for bundled, MPEG-2 encoded
video and audio data that may be used with RTP, version 2. Bundling
has some advantages for this payload type particularly when it is
used for video-on-demand applications. This payload type may be used
when its advantages are important enough to sacrifice the modularity
of having separate audio and video streams.
1. Introduction
This document describes a bundled packetization scheme for MPEG-2
encoded audio and video streams using the Real-time Transport
Protocol (RTP), version 2 [1].
The MPEG-2 International standard consists of three layers: audio,
video and systems [2]. The audio and the video layers define the
syntax and semantics of the corresponding "elementary streams." The
systems layer supports synchronization and interleaving of multiple
compressed streams, buffer initialization and management, and time
identification. RFC 2250 [3] describes packetization techniques to
transport individual audio and video elementary streams as well as
the transport stream, which is defined at the systems layer, using
RTP.
Civanlar, et. al. Experimental [Page 1]
RFC 2343 RTP Payload Format for Bundled MPEG May 1998
The bundled packetization scheme is needed because it has several
advantages over other schemes for some important applications
including video-on-demand (VOD), where audio and video are always
used together. Its advantages over independent packetization of
audio and video are:
1. Uses a single port per "program" (i.e., bundled A/V). This may
increase the number of streams that can be served, e.g., from a VOD
server. It also eliminates the performance hit incurred on the client
side when two ports are used for the separate audio and video streams.
2. Provides implicit synchronization of audio and video. This is
particularly convenient when the A/V data is stored in an
interleaved format at the server.
3. Reduces the header overhead. Since using large packets increases
the effects of losses and delay, audio-only packets need to be
small, which increases their relative header overhead. An A/V bundled
format can provide about 1% overall overhead reduction. Considering
the high bitrates used for MPEG-2 encoded material, e.g., 4 Mbps, the
number of bits saved, e.g., 40 kbps, may provide noticeable audio or
video quality improvement.
4. May reduce overall receiver buffer size. Audio and video streams
may experience different delays when transmitted separately. The
receiver buffers need to be designed for the longest of these
delays. For example, let's assume that using two buffers, each with
a size B, is sufficient with probability P when each stream is
transmitted individually. The probability that the same buffer size
will be sufficient when both streams need to be received is P times
the conditional probability of B being sufficient for the second
stream given that it was sufficient for the first one. This
conditional probability is generally less than one, so a larger buffer
size is required to achieve the same probability level.
5. May help with the control of the overall bandwidth used by an
A/V program.
The advantages over packetization of the transport layer streams
are:
1. Reduced overhead. The bundled format does not contain systems layer
information, which is redundant with RTP (both address essentially
similar issues).
2. Easier error recovery. Because of the structured packetization
consistent with the application layer framing (ALF) principle, loss
concealment and error recovery can be made simpler and more
effective.
2. Encapsulation of Bundled MPEG Video and Audio
Video encapsulation follows rules similar to the ones described in
[3] for encapsulation of MPEG elementary streams. Specifically,
1. The MPEG Video_Sequence_Header, when present, will always be at
the beginning of an RTP payload.
2. An MPEG GOP_header, when present, will always be at the
beginning of the RTP payload, or will follow a
Video_Sequence_Header.
3. An MPEG Picture_Header, when present, will always be at the
beginning of an RTP payload, or will follow a GOP_header.
In addition to these, it is required that:
4. Each packet must contain an integral number of video slices.
It is the application's responsibility to adjust the slice sizes and
the number of slices put in each RTP packet so that lower level
fragmentation does not occur. This approach simplifies the receivers
while somewhat increasing the complexity of the transmitter's
packetizer. Considering that a slice can be as small as a single
macroblock, it is possible to prevent fragmentation for most of the
cases. If a packet size exceeds the path maximum transmission unit
(path-MTU) [4], this payload type depends on the lower protocol
layers for fragmentation, although this may cause problems with
packet classification for integrated services (e.g., with RSVP).
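The slice-grouping responsibility described above can be illustrated with a minimal sketch. It assumes a simple greedy strategy, which this memo does not mandate, and the names `pack_slices` and `max_video_bytes` are hypothetical. The byte budget is assumed to be what remains of the path MTU after the lower-layer headers, the RTP and BMPEG headers, and any audio data have been accounted for.

```python
from typing import List


def pack_slices(slice_sizes: List[int], max_video_bytes: int) -> List[List[int]]:
    """Greedily group consecutive slices into packets.

    Each packet holds an integral number of slices. A slice that is larger
    than the budget is placed alone in its own packet; that packet would
    then depend on lower-layer fragmentation.
    """
    packets: List[List[int]] = []
    current: List[int] = []  # slice indices in the packet being built
    used = 0                 # bytes of video already placed in that packet
    for index, size in enumerate(slice_sizes):
        # Close the current packet if this slice would exceed the budget.
        if current and used + size > max_video_bytes:
            packets.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        packets.append(current)
    return packets


if __name__ == "__main__":
    # Hypothetical slice sizes in bytes and a 1000-byte video budget.
    print(pack_slices([400, 500, 300, 1500, 200], 1000))
    # prints [[0, 1], [2], [3], [4]]
```

The sketch ignores the header placement rules 1 to 3; a real packetizer would also start a new packet wherever a sequence, GOP or picture header must begin the payload.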
The video data is followed by a sufficient number of integral audio
frames to cover the duration of the video segment included in a
packet. For example, if the first packet contains three slices of
video, each 1/900 second long, and Layer I audio coding is used at a
44.1 kHz sampling rate, only one audio frame covering 384/44100
seconds of audio need be included in this packet. Since the length of
this audio frame (8.71 msec) is longer than that of the video
segment contained in this packet (3.33 msec), the next few packets
may not contain any audio frames until the packet in which the
covered video time extends outside the length of the previously
transmitted audio frames. Alternatively, it is possible, in this
proposal, to repeat the latest audio frame in "no-audio" packets for
packet loss resilience. Again, it is the application's responsibility
to adjust the bundled packet size according to the minimum MTU size
to prevent fragmentation.
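The audio coverage rule in the example above can be sketched as follows. This is one possible reading of the rule, not a normative algorithm: a packet carries new audio frames only when the video time sent so far extends past the audio time already sent. The function name `audio_frames_per_packet` is hypothetical, and the optional repetition of audio frames for loss resilience is not modeled.

```python
from fractions import Fraction
from typing import List


def audio_frames_per_packet(video_durations: List[Fraction],
                            audio_frame_duration: Fraction) -> List[int]:
    """Return, per packet, the number of new audio frames needed so that
    the audio sent so far covers the video sent so far."""
    counts: List[int] = []
    video_end = Fraction(0)  # end time of all video sent so far
    audio_end = Fraction(0)  # end time of all audio sent so far
    for duration in video_durations:
        video_end += duration
        frames = 0
        # Add whole audio frames until the video segment is covered.
        while audio_end < video_end:
            audio_end += audio_frame_duration
            frames += 1
        counts.append(frames)
    return counts


if __name__ == "__main__":
    # Six packets, each carrying three slices of 1/900 second (3.33 msec).
    video = [Fraction(3, 900)] * 6
    # Layer I audio at 44.1 kHz: 384 samples per frame (8.71 msec).
    layer1_frame = Fraction(384, 44100)
    print(audio_frames_per_packet(video, layer1_frame))
    # prints [1, 0, 1, 0, 0, 1]
```

Exact fractions are used instead of floating point so that the coverage comparison does not drift over a long stream.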
2.1. RTP Fixed Header for BMPEG Encapsulation
The following RTP header fields are used:
Payload Type: A distinct payload type number, which may be dynamic,
should be assigned to BMPEG.
M Bit: Set for packets containing the end of a picture.
timestamp: 32-bit 90 kHz timestamp representing the sampling time of
the MPEG picture. May not be monotonically increasing if B pictures
are present. Same for all packets belonging to the same picture.
For packets that contain only a sequence, extension and/or GOP
header, the timestamp is that of the subsequent picture.
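As an illustration of why the timestamp may not increase monotonically, the following sketch derives a 90 kHz timestamp from a picture's position in display order. It assumes that the sampling time of a picture is its display index divided by the frame rate; the function name `picture_timestamp` and the `base` parameter (standing in for the initial RTP timestamp offset) are hypothetical.

```python
from fractions import Fraction

RTP_CLOCK_HZ = 90_000  # 90 kHz clock used for the RTP timestamp


def picture_timestamp(display_index: int, frame_rate: Fraction,
                      base: int = 0) -> int:
    """Return the 32-bit, 90 kHz timestamp of a picture.

    Assumes the sampling time is display_index / frame_rate seconds.
    """
    ticks = Fraction(display_index * RTP_CLOCK_HZ) / frame_rate
    return (base + int(ticks)) % 2**32  # wrap to 32 bits


if __name__ == "__main__":
    # Pictures in transmission order with their display indices:
    # I (0), P (3), B (1), B (2), at 25 frames per second.
    transmission_order = [0, 3, 1, 2]
    print([picture_timestamp(i, Fraction(25)) for i in transmission_order])
    # prints [0, 10800, 3600, 7200]
```

Because the P picture is transmitted before the B pictures that precede it in display order, the timestamps drop from 10800 to 3600 between the second and third pictures.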
2.2. BMPEG Specific Header
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| P |N|MBZ|   Audio Length    | |         Audio Offset          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                              MBZ
P: Picture type (2 bits). I (0), P (1), B (2).
N: Header data changed (1 bit). Set if any part of the video
sequence, extension, GOP and picture header data differs from
that of the previously sent headers. It is reset when all the
header data is repeated (see Appendix 1).
MBZ: Must be zero. Reserved for future use.
Audio Length: (10 bits) Length of the audio data in this packet in
bytes. Start of the audio data is found by subtracting "Audio
Length" from the total length of the received packet.
Audio Offset: (16 bits) The offset between the start of the audio
frame and the RTP timestamp for this packet in number of audio
samples (for multi-channel sources, a set of samples covering all
channels is counted as one sample for this purpose).
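The field layout above can be sketched as a pack/unpack pair. This is a minimal illustration, not a reference implementation. It assumes bit 0 is the most significant bit of a 32-bit word in network byte order, that Audio Offset is unsigned (the text does not state its signedness), and that the payload passed to `split_payload` starts at the BMPEG header, i.e. after the RTP fixed header. All names in the code are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple


class BmpegHeader(NamedTuple):
    picture_type: int     # P, 2 bits: I (0), P (1), B (2)
    header_changed: bool  # N, 1 bit
    audio_length: int     # 10 bits, length of audio data in bytes
    audio_offset: int     # 16 bits, in audio samples (assumed unsigned)


def pack_bmpeg_header(h: BmpegHeader) -> bytes:
    """Encode the header as 4 bytes; MBZ bits are left at zero."""
    if not 0 <= h.picture_type <= 2:
        raise ValueError("picture_type must be 0 (I), 1 (P) or 2 (B)")
    if not 0 <= h.audio_length < 1 << 10:
        raise ValueError("audio_length must fit in 10 bits")
    if not 0 <= h.audio_offset < 1 << 16:
        raise ValueError("audio_offset must fit in 16 bits")
    word = ((h.picture_type << 30)           # bits 0-1
            | (int(h.header_changed) << 29)  # bit 2
            | (h.audio_length << 17)         # bits 5-14
            | h.audio_offset)                # bits 16-31
    return struct.pack("!I", word)


def unpack_bmpeg_header(data: bytes) -> BmpegHeader:
    """Decode the first 4 bytes of data; MBZ bits are ignored."""
    (word,) = struct.unpack("!I", data[:4])
    return BmpegHeader(
        picture_type=(word >> 30) & 0x3,
        header_changed=bool((word >> 29) & 0x1),
        audio_length=(word >> 17) & 0x3FF,
        audio_offset=word & 0xFFFF,
    )


def split_payload(payload: bytes) -> Tuple[BmpegHeader, bytes, bytes]:
    """Split a payload into (header, video data, audio data).

    The audio start is the total length minus Audio Length.
    """
    header = unpack_bmpeg_header(payload)
    audio_start = len(payload) - header.audio_length
    return header, payload[4:audio_start], payload[audio_start:]


if __name__ == "__main__":
    hdr = BmpegHeader(picture_type=2, header_changed=True,
                      audio_length=5, audio_offset=300)
    raw = pack_bmpeg_header(hdr)
    print(raw.hex())  # prints a00a012c
    print(split_payload(raw + b"VIDEO" + b"AUDIO"))
```

Computing the audio start from the end of the packet means no separate video length field is needed, which is why the header carries only Audio Length.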