Network Working Group                                          I. Cooper
Request for Comments: 3040                                 Equinix, Inc.
Category: Informational                                         I. Melve
                                                                 UNINETT
                                                            G. Tomlinson
                                                          CacheFlow Inc.
                                                            January 2001

             Internet Web Replication and Caching Taxonomy

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001).  All Rights Reserved.

Abstract

   This memo specifies standard terminology and the taxonomy of web
   replication and caching infrastructure as deployed today.  It
   introduces standard concepts, and protocols used today within this
   application domain.  Currently deployed solutions employing these
   technologies are presented to establish a standard taxonomy.  Known
   problems with caching proxies are covered in the document titled
   "Known HTTP Proxy/Caching Problems", and are not part of this
   document.  This document presents open protocols and points to
   published material for each protocol.

Table of Contents

   1.      Introduction
   2.      Terminology
   2.1     Base Terms
   2.2     First order derivative terms
   2.3     Second order derivatives
   2.4     Topological terms
   2.5     Automatic use of proxies
   3.      Distributed System Relationships
   3.1     Replication Relationships
   3.1.1   Client to Replica
   3.1.2   Inter-Replica
   3.2     Proxy Relationships
   3.2.1   Client to Non-Interception Proxy
   3.2.2   Client to Surrogate to Origin Server
   3.2.3   Inter-Proxy
   3.2.3.1 (Caching) Proxy Meshes
   3.2.3.2 (Caching) Proxy Arrays
   3.2.4   Network Element to Caching Proxy
   4.      Replica Selection
   4.1     Navigation Hyperlinks
   4.2     Replica HTTP Redirection
   4.3     DNS Redirection
   5.      Inter-Replica Communication
   5.1     Batch Driven Replication
   5.2     Demand Driven Replication
   5.3     Synchronized Replication
   6.      User Agent to Proxy Configuration
   6.1     Manual Proxy Configuration
   6.2     Proxy Auto Configuration (PAC)
   6.3     Cache Array Routing Protocol (CARP) v1.0
   6.4     Web Proxy Auto-Discovery Protocol (WPAD)
   7.      Inter-Proxy Communication
   7.1     Loosely coupled Inter-Proxy Communication
   7.1.1   Internet Cache Protocol (ICP)
   7.1.2   Hyper Text Caching Protocol
   7.1.3   Cache Digest
   7.1.4   Cache Pre-filling
   7.2     Tightly Coupled Inter-Cache Communication
   7.2.1   Cache Array Routing Protocol (CARP) v1.0
   8.      Network Element Communication
   8.1     Web Cache Control Protocol (WCCP)
   8.2     Network Element Control Protocol (NECP)
   8.3     SOCKS
   9.      Security Considerations
   9.1     Authentication
   9.1.1   Man in the middle attacks
   9.1.2   Trusted third party
   9.1.3   Authentication based on IP number
   9.2     Privacy
   9.2.1   Trusted third party
   9.2.2   Logs and legal implications
   9.3     Service security
   9.3.1   Denial of service
   9.3.2   Replay attack
   9.3.3   Stupid configuration of proxies
   9.3.4   Copyrighted transient copies
   9.3.5   Application level access
   10.     Acknowledgements
           References
           Authors' Addresses
           Full Copyright Statement

1. Introduction

   Since its introduction in 1990, the World-Wide Web has evolved from a
   simple client server model into a complex distributed architecture.
   This evolution has been driven largely by the scaling problems
   associated with exponential growth.  Distinct paradigms and solutions
   have emerged to satisfy specific requirements.  Two core
   infrastructure components being employed to meet the demands of this
   growth are replication and caching.  In many cases, there is a need
   for web caches and replicated services to be able to coexist.

   This memo specifies standard terminology and the taxonomy of web
   replication and caching infrastructure deployed in the Internet
   today.  The principal goal of this document is to establish a common
   understanding and reference point of this application domain.  It is
   also expected that this document will be used in the creation of a
   standard architectural framework for efficient, reliable, and
   predictable service in a web which includes both replicas and caches.

   Some of the protocols which this memo examines are specified only by
   company technical white papers or work in progress documents.  Such
   references are included to demonstrate the existence of such
   protocols, their experimental deployment in the Internet today, or to
   aid the reader in their understanding of this technology area.

   There are many protocols, both open and proprietary, employed in web
   replication and caching today.  A majority of the open protocols
   include DNS [8], Cache Digests [21][10], CARP [14], HTTP [1], ICP
   [2], PAC [12], SOCKS [7], WPAD [13], and WCCP [18][19].  These
   protocols, and their use within the caching and replication
   environments, are discussed below.

2. Terminology

   The following terminology provides definitions of common terms used
   within the web replication and caching community.  Base terms are
   taken, where possible, from the HTTP/1.1 specification [1] and are
   included here for reference.  First- and second-order derivatives are
   constructed from these base terms to help define the relationships
   that exist within this area.

   Terms that are in common usage and which are contrary to definitions
   in RFC 2616 and this document are highlighted.

2.1 Base Terms

   The majority of these terms are taken as-is from RFC 2616 [1], and
   are included here for reference.

   client (taken from [1])
      A program that establishes connections for the purpose of sending
      requests.

   server (taken from [1])
      An application program that accepts connections in order to
      service requests by sending back responses.  Any given program may
      be capable of being both a client and a server; our use of these
      terms refers only to the role being performed by the program for a
      particular connection, rather than to the program's capabilities
      in general.  Likewise, any server may act as an origin server,
      proxy, gateway, or tunnel, switching behavior based on the nature
      of each request.

   proxy (taken from [1])
      An intermediary program which acts as both a server and a client
      for the purpose of making requests on behalf of other clients.
      Requests are serviced internally or by passing them on, with
      possible translation, to other servers.  A proxy MUST implement
      both the client and server requirements of this specification.  A
      "transparent proxy" is a proxy that does not modify the request or
      response beyond what is required for proxy authentication and
      identification.  A "non-transparent proxy" is a proxy that
      modifies the request or response in order to provide some added
      service to the user agent, such as group annotation services,
      media type transformation, protocol reduction, or anonymity
      filtering.  Except where either transparent or non-transparent
      behavior is explicitly stated, the HTTP proxy requirements apply
      to both types of proxies.

      Note: The term "transparent proxy" refers to a semantically
      transparent proxy as described in [1], not what is commonly
      understood within the caching community.  We recommend that the
      term "transparent proxy" is always prefixed to avoid confusion
      (e.g., "network transparent proxy").  However, see definition of
      "interception proxy" below.

      The above condition requiring implementation of both the server
      and client requirements of HTTP/1.1 is only appropriate for a
      non-network transparent proxy.

   cache (taken from [1])
      A program's local store of response messages and the subsystem
      that controls its message storage, retrieval, and deletion.  A
      cache stores cacheable responses in order to reduce the response
      time and network bandwidth consumption on future, equivalent
      requests.  Any client or server may include a cache, though a
      cache cannot be used by a server that is acting as a tunnel.

      Note: The term "cache" used alone often is meant as "caching
      proxy".

      Note: There are additional motivations for caching, for example
      reducing server load (as a further means to reduce response time).

   cacheable (taken from [1])
      A response is cacheable if a cache is allowed to store a copy of
      the response message for use in answering subsequent requests.
      The rules for determining the cacheability of HTTP responses are
      defined in section 13 of [1].  Even if a resource is cacheable,
      there may be additional constraints on whether a cache can use the
      cached copy for a particular request.

   gateway (taken from [1])
      A server which acts as an intermediary for some other server.
      Unlike a proxy, a gateway receives requests as if it were the
      origin server for the requested resource; the requesting client
      may not be aware that it is communicating with a gateway.

   tunnel (taken from [1])
      An intermediary program which is acting as a blind relay between
      two connections.  Once active, a tunnel is not considered a party
      to the HTTP communication, though the tunnel may have been
      initiated by an HTTP request.  The tunnel ceases to exist when
      both ends of the relayed connections are closed.

   replication
      "Creating and maintaining a duplicate copy of a database or file
      system on a different computer, typically a server."
      - Free Online Dictionary of Computing (FOLDOC)

   inbound/outbound (taken from [1])
      Inbound and outbound refer to the request and response paths for
      messages: "inbound" means "traveling toward the origin server",
      and "outbound" means "traveling toward the user agent".

   network element
      A network device that introduces multiple paths between source and
      destination, transparent to HTTP.

2.2 First order derivative terms

   The following terms are constructed taking the above base terms as
   foundation.

   origin server (taken from [1])
      The server on which a given resource resides or is to be created.

   user agent (taken from [1])
      The client which initiates a request.  These are often browsers,
      editors, spiders (web-traversing robots), or other end user tools.

   caching proxy
      A proxy with a cache, acting as a server to clients, and a client
      to servers.

      Caching proxies are often referred to as "proxy caches" or simply
      "caches".  The term "proxy" is also frequently misused when
      referring to caching proxies.

   surrogate
      A gateway co-located with an origin server, or at a different
      point in the network, delegated the authority to operate on behalf
      of, and typically working in close co-operation with, one or more
      origin servers.  Responses are typically delivered from an
      internal cache.

      Surrogates may derive cache entries from the origin server or from
      another of the origin server's delegates.  In some cases a
      surrogate may tunnel such requests.

      Where close co-operation between origin servers and surrogates
      exists, this enables modifications of some protocol requirements,
      including the Cache-Control directives in [1].  Such modifications
      have yet to be fully specified.

      Devices commonly known as "reverse proxies" and "(origin) server
      accelerators" are both more properly defined as surrogates.

   reverse proxy
      See "surrogate".

   server accelerator
      See "surrogate".

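   The "caching proxy" term above can be illustrated with a short Python
   sketch.  It is an illustration only and is not taken from [1] or from
   any deployed product.  The names Response, CachingProxy,
   fetch_inbound, and origin_server are hypothetical, and the boolean
   "cacheable" flag is an assumed stand-in for the cacheability rules of
   HTTP/1.1, which are far richer than a single flag.  The sketch shows
   the two roles named in the definition: the proxy acts as a server to
   its clients, and as a client when it forwards a request inbound.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    status: int
    body: str
    # Assumption: one flag replaces the real HTTP/1.1 cacheability rules.
    cacheable: bool = True


# Hypothetical type for an inbound request toward the origin server.
InboundFetch = Callable[[str], Response]


@dataclass
class CachingProxy:
    fetch_inbound: InboundFetch
    store: Dict[str, Response] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def handle(self, url: str) -> Response:
        """Server role: answer a client, from the cache when possible."""
        cached = self.store.get(url)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        # Client role: pass the request on toward the origin server.
        response = self.fetch_inbound(url)
        if response.cacheable and response.status == 200:
            self.store[url] = response
        return response


def origin_server(url: str) -> Response:
    """Hypothetical origin server; URLs ending in /private are uncacheable."""
    return Response(200, f"content of {url}",
                    cacheable=not url.endswith("/private"))


proxy = CachingProxy(origin_server)
proxy.handle("http://example.com/a")        # miss, stored
proxy.handle("http://example.com/a")        # hit, served from cache
proxy.handle("http://example.com/private")  # miss, not stored
proxy.handle("http://example.com/private")  # miss again
print(proxy.hits, proxy.misses)  # 1 3
```

   A surrogate could be sketched with the same structure; the
   difference lies in where it sits and on whose behalf it operates,
   not in the cache lookup itself.
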
2.3 Second order derivatives

   The following terms further build on first order derivatives:

   master origin server
      An origin server on which the definitive version of a resource
      resides.

   replica origin server
      An origin server holding a replica of a resource, but which may
      act as an authoritative reference for client requests.

   content consumer
      The user or system that initiates inbound requests, through use of
      a user agent.

   browser
      A special instance of a user agent that acts as a content