rfc3550.txt

   document are to be interpreted as described in BCP 14, RFC 2119 [2]
   and indicate requirement levels for compliant RTP implementations.

2. RTP Use Scenarios

   The following sections describe some aspects of the use of RTP.  The
   examples were chosen to illustrate the basic operation of
   applications using RTP, not to limit what RTP may be used for.  In
   these examples, RTP is carried on top of IP and UDP, and follows the
   conventions established by the profile for audio and video specified
   in the companion RFC 3551.

Schulzrinne, et al.         Standards Track                     [Page 5]

RFC 3550                          RTP                          July 2003

2.1 Simple Multicast Audio Conference

   A working group of the IETF meets to discuss the latest protocol
   document, using the IP multicast services of the Internet for voice
   communications.  Through some allocation mechanism the working group
   chair obtains a multicast group address and pair of ports.  One port
   is used for audio data, and the other is used for control (RTCP)
   packets.  This address and port information is distributed to the
   intended participants.  If privacy is desired, the data and control
   packets may be encrypted as specified in Section 9.1, in which case
   an encryption key must also be generated and distributed.  The exact
   details of these allocation and distribution mechanisms are beyond
   the scope of RTP.

   The audio conferencing application used by each conference
   participant sends audio data in small chunks of, say, 20 ms duration.
   Each chunk of audio data is preceded by an RTP header; RTP header and
   data are in turn contained in a UDP packet.  The RTP header indicates
   what type of audio encoding (such as PCM, ADPCM or LPC) is contained
   in each packet so that senders can change the encoding during a
   conference, for example, to accommodate a new participant that is
   connected through a low-bandwidth link or react to indications of
   network congestion.
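
   As an illustration of the packetization just described, the following
   Python sketch prepends a 12-octet fixed RTP header (the layout
   defined in Section 5.1) to each 20 ms chunk of audio.  It assumes
   8000 Hz sampling with one octet per sample (160 octets per chunk),
   payload type 0 (PCMU in the RFC 3551 profile), and no padding, header
   extension, or CSRC list.  The function name build_rtp_packet, the
   SSRC value, and the all-zero payload are hypothetical and were chosen
   only for this example.

```python
import struct

RTP_VERSION = 2
SAMPLES_PER_CHUNK = 160  # 20 ms at 8000 Hz (assumed clock rate)


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Return payload preceded by a fixed RTP header (hypothetical helper).

    Assumes P=0, X=0 and CC=0, so the header is exactly 12 octets.
    """
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    byte0 = RTP_VERSION << 6                   # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | payload_type  # M bit, then 7-bit PT
    header = struct.pack(
        "!BBHII",                 # network byte order
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit timestamp in sampling units
        ssrc & 0xFFFFFFFF,        # 32-bit synchronization source
    )
    return header + payload


# Three consecutive chunks.  A real sender would start from random
# sequence number and timestamp values; zero is used here for clarity.
chunk = bytes(SAMPLES_PER_CHUNK)  # placeholder payload, not real audio
packets = [
    build_rtp_packet(0, seq=i, timestamp=i * SAMPLES_PER_CHUNK,
                     ssrc=0x1234ABCD, payload=chunk)
    for i in range(3)
]
```

   Each resulting packet would then be handed to a UDP socket.  The
   sequence number advances by one per packet and the timestamp by the
   number of samples in the chunk, which is what lets a receiver detect
   loss and rebuild the 20 ms spacing.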

   The Internet, like other packet networks, occasionally loses and
   reorders packets and delays them by variable amounts of time.  To
   cope with these impairments, the RTP header contains timing
   information and a sequence number that allow the receivers to
   reconstruct the timing produced by the source, so that in this
   example, chunks of audio are contiguously played out the speaker
   every 20 ms.  This timing reconstruction is performed separately for
   each source of RTP packets in the conference.  The sequence number
   can also be used by the receiver to estimate how many packets are
   being lost.

   Since members of the working group join and leave during the
   conference, it is useful to know who is participating at any moment
   and how well they are receiving the audio data.  For that purpose,
   each instance of the audio application in the conference periodically
   multicasts a reception report plus the name of its user on the RTCP
   (control) port.  The reception report indicates how well the current
   speaker is being received and may be used to control adaptive
   encodings.  In addition to the user name, other identifying
   information may also be included subject to control bandwidth limits.
   A site sends the RTCP BYE packet (Section 6.6) when it leaves the
   conference.

2.2 Audio and Video Conference

   If both audio and video media are used in a conference, they are
   transmitted as separate RTP sessions.  That is, separate RTP and RTCP
   packets are transmitted for each medium using two different UDP port
   pairs and/or multicast addresses.  There is no direct coupling at the
   RTP level between the audio and video sessions, except that a user
   participating in both sessions should use the same distinguished
   (canonical) name in the RTCP packets for both so that the sessions
   can be associated.
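
   The association by canonical name can be sketched as follows.  This
   is a minimal illustration, assuming the receiver has already
   collected, for each session, a mapping from SSRC to the CNAME
   announced in RTCP SDES packets (Section 6.5).  The function name
   associate_by_cname, the dictionary layout, and the sample names and
   SSRC values are hypothetical.

```python
from collections import defaultdict


def associate_by_cname(sessions):
    """Group SSRCs from separate RTP sessions under their shared CNAME.

    sessions maps a session label (e.g. "audio") to a dict of
    {ssrc: cname} learned from RTCP for that session (assumed layout).
    """
    by_cname = defaultdict(dict)
    for session_label, ssrc_to_cname in sessions.items():
        for ssrc, cname in ssrc_to_cname.items():
            by_cname[cname][session_label] = ssrc
    return dict(by_cname)


# The SSRC spaces of the two sessions are independent; only the CNAME
# ties a participant's audio and video sources together.
audio = {0x11111111: "alice@host-a.example", 0x22222222: "bob@host-b.example"}
video = {0x33333333: "alice@host-a.example"}

participants = associate_by_cname({"audio": audio, "video": video})
```

   Here the entry for alice@host-a.example lists one SSRC per session,
   while bob@host-b.example appears in the audio session only, as would
   be the case for a participant who chose to receive and send a single
   medium.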

   One motivation for this separation is to allow some participants in
   the conference to receive only one medium if they choose.  Further
   explanation is given in Section 5.2.  Despite the separation,
   synchronized playback of a source's audio and video can be achieved
   using timing information carried in the RTCP packets for both
   sessions.

2.3 Mixers and Translators

   So far, we have assumed that all sites want to receive media data in
   the same format.  However, this may not always be appropriate.
   Consider the case where participants in one area are connected
   through a low-speed link to the majority of the conference
   participants who enjoy high-speed network access.  Instead of forcing
   everyone to use a lower-bandwidth, reduced-quality audio encoding, an
   RTP-level relay called a mixer may be placed near the low-bandwidth
   area.  This mixer resynchronizes incoming audio packets to
   reconstruct the constant 20 ms spacing generated by the sender, mixes
   these reconstructed audio streams into a single stream, translates
   the audio encoding to a lower-bandwidth one and forwards the
   lower-bandwidth packet stream across the low-speed link.  These
   packets might be unicast to a single recipient or multicast on a
   different address to multiple recipients.  The RTP header includes a
   means for mixers to identify the sources that contributed to a mixed
   packet so that correct talker indication can be provided at the
   receivers.

   Some of the intended participants in the audio conference may be
   connected with high bandwidth links but might not be directly
   reachable via IP multicast.  For example, they might be behind an
   application-level firewall that will not let any IP packets pass.
   For these sites, mixing may not be necessary, in which case another
   type of RTP-level relay called a translator may be used.
   Two translators are installed, one on either side of the firewall,
   with the outside one funneling all multicast packets received
   through a secure connection to the translator inside the firewall.
   The translator inside the firewall sends them again as multicast
   packets to a multicast group restricted to the site's internal
   network.

   Mixers and translators may be designed for a variety of purposes.  An
   example is a video mixer that scales the images of individual people
   in separate video streams and composites them into one video stream
   to simulate a group scene.  Other examples of translation include the
   connection of a group of hosts speaking only IP/UDP to a group of
   hosts that understand only ST-II, or the packet-by-packet encoding
   translation of video streams from individual sources without
   resynchronization or mixing.  Details of the operation of mixers and
   translators are given in Section 7.

2.4 Layered Encodings

   Multimedia applications should be able to adjust the transmission
   rate to match the capacity of the receiver or to adapt to network
   congestion.  Many implementations place the responsibility of
   rate-adaptivity at the source.  This does not work well with
   multicast transmission because of the conflicting bandwidth
   requirements of heterogeneous receivers.  The result is often a
   least-common denominator scenario, where the smallest pipe in the
   network mesh dictates the quality and fidelity of the overall live
   multimedia "broadcast".

   Instead, responsibility for rate-adaptation can be placed at the
   receivers by combining a layered encoding with a layered transmission
   system.
   In the context of RTP over IP multicast, the source can stripe the
   progressive layers of a hierarchically represented signal across
   multiple RTP sessions each carried on its own multicast group.
   Receivers can then adapt to network heterogeneity and control their
   reception bandwidth by joining only the appropriate subset of the
   multicast groups.

   Details of the use of RTP with layered encodings are given in
   Sections 6.3.9, 8.3 and 11.

3. Definitions

   RTP payload: The data transported by RTP in a packet, for
      example audio samples or compressed video data.  The payload
      format and interpretation are beyond the scope of this document.

   RTP packet: A data packet consisting of the fixed RTP header, a
      possibly empty list of contributing sources (see below), and the
      payload data.  Some underlying protocols may require an
      encapsulation of the RTP packet to be defined.  Typically one
      packet of the underlying protocol contains a single RTP packet,
      but several RTP packets MAY be contained if permitted by the
      encapsulation method (see Section 11).

   RTCP packet: A control packet consisting of a fixed header part
      similar to that of RTP data packets, followed by structured
      elements that vary depending upon the RTCP packet type.  The
      formats are defined in Section 6.  Typically, multiple RTCP
      packets are sent together as a compound RTCP packet in a single
      packet of the underlying protocol; this is enabled by the length
      field in the fixed header of each RTCP packet.

   Port: The "abstraction that transport protocols use to
      distinguish among multiple destinations within a given host
      computer.  TCP/IP protocols identify ports using small positive
      integers."
      [12] The transport selectors (TSEL) used by the OSI transport
      layer are equivalent to ports.  RTP depends upon the lower-layer
      protocol to provide some mechanism such as ports to multiplex the
      RTP and RTCP packets of a session.

   Transport address: The combination of a network address and port
      that identifies a transport-level endpoint, for example an IP
      address and a UDP port.  Packets are transmitted from a source
      transport address to a destination transport address.

   RTP media type: An RTP media type is the collection of payload
      types which can be carried within a single RTP session.  The RTP
      Profile assigns RTP media types to RTP payload types.

   Multimedia session: A set of concurrent RTP sessions among a
      common group of participants.  For example, a videoconference
      (which is a multimedia session) may contain an audio RTP session
      and a video RTP session.

   RTP session: An association among a set of participants
      communicating with RTP.  A participant may be involved in multiple
      RTP sessions at the same time.  In a multimedia session, each
      medium is typically carried in a separate RTP session with its own
      RTCP packets unless the encoding itself multiplexes multiple
      media into a single data stream.  A participant distinguishes
      multiple RTP sessions by reception of different sessions using
      different pairs of destination transport addresses, where a pair
      of transport addresses comprises one network address plus a pair
      of ports for RTP and RTCP.  All participants in an RTP session may
      share a common destination transport address pair, as in the case
      of IP multicast, or the pairs may be different for each
      participant, as in the case of individual unicast network
      addresses and port pairs.
      In the unicast case, a participant may receive from all other
      participants in the session using the same pair of ports, or may
      use a distinct pair of ports for each.

      The distinguishing feature of an RTP session is that each
      maintains a full, separate space of SSRC identifiers (defined
      next).  The set of participants included in one RTP session
      consists of those that can receive an SSRC identifier transmitted
      by any one of the participants either in RTP as the SSRC or a CSRC
      (also defined below) or in RTCP.  For example, consider a
      three-party conference implemented using unicast UDP with each
      participant receiving from the other two on separate port pairs.
      If each participant sends RTCP feedback about data received from
      one other participant only back to that participant, then the
      conference is composed of three separate point-to-point RTP
      sessions.  If each participant provides RTCP feedback about its
      reception of one other participant to both of the other
      participants, then the conference is composed of one multi-party
      RTP session.  The latter case simulates the behavior that would
      occur with IP multicast communication among the three
      participants.

      The RTP framework allows the variations defined here, but a
      particular control protocol or application design will usually
      impose constraints on these variations.
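
   The two properties of an RTP session described above, identification
   by destination transport address pair and a separate SSRC space per
   session, can be sketched as a small lookup table.  This is only an
   illustration for the multicast case, not a complete receiver.  The
   class names TransportAddressPair, RtpSession, and SessionTable, the
   on_packet method, and the addresses, ports, and SSRC values are all
   hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TransportAddressPair:
    """One network address plus a pair of ports for RTP and RTCP."""
    address: str
    rtp_port: int
    rtcp_port: int


@dataclass
class RtpSession:
    """Per-session state; each session has its own SSRC space."""
    ssrcs: set = field(default_factory=set)


class SessionTable:
    """Maps destination transport address pairs to RTP sessions."""

    def __init__(self):
        self._sessions = {}

    def on_packet(self, address, rtp_port, rtcp_port, ssrc):
        # Packets received on the same destination pair belong to the
        # same session; a different pair means a different session.
        key = TransportAddressPair(address, rtp_port, rtcp_port)
        session = self._sessions.setdefault(key, RtpSession())
        session.ssrcs.add(ssrc)
        return session

    def __len__(self):
        return len(self._sessions)


table = SessionTable()
# The same SSRC value seen on two destination pairs is recorded in two
# independent SSRC spaces, one per session.
audio = table.on_packet("224.2.0.1", 5004, 5005, ssrc=0xAAAA0001)
video = table.on_packet("224.2.0.2", 5006, 5007, ssrc=0xAAAA0001)
```

   In the unicast variations described above the destination pairs may
   differ per participant, so a real implementation would need
   additional information from its control protocol to decide which
   pairs form one session.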
