Network Working Group                                         D. Wessels
Internet-Draft                                                 K. Claffy
                                          National Laboratory for Applied
Obsoletes <draft-wessels-icp-v2-appl-02.txt>        Network Research/UCSD
Expires: 8 January 1998                                       8 July 1997


         Application of Internet Cache Protocol (ICP), version 2
                   <draft-wessels-icp-v2-appl-03.txt>

Status of this Memo

   This document is an Internet-Draft.  Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups.  Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as ``work in
   progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts
   Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).

Abstract

   This document describes the application of ICPv2 (Internet Cache
   Protocol version 2, RFCXXXX) to Web caching.  ICPv2 is a lightweight
   message format used for communication among Web caches.  Several
   independent caching implementations now use ICP [3,5], making it
   important to codify the existing practical uses of ICP for those
   trying to implement, deploy, and extend its use.

   ICP queries and replies refer to the existence of URLs (or objects)
   in neighbor caches.  Caches exchange ICP messages and use the
   gathered information to select the most appropriate location from
   which to retrieve an object.  A companion document (RFCXXXX)
   describes the format and syntax of the protocol itself.  In this
   document we focus on issues of ICP deployment, efficiency, security,
   and interaction with other aspects of Web traffic behavior.

Wessels & Claffy                                                [Page 1]
Internet-Draft                                                8 Jul 1997

Table of Contents

   1.  Introduction
   2.  Web Cache Hierarchies
   3.  What is the Added Value of ICP?
   4.  Example Configuration of ICP Hierarchy
     4.1.  Configuring the `proxy.customer.org' cache
     4.2.  Configuring the `cache.isp.com' cache
   5.  Applying the Protocol
     5.1.  Sending ICP Queries
     5.2.  Receiving ICP Queries and Sending Replies
     5.3.  Receiving ICP Replies
     5.4.  ICP Options
   6.  Firewalls
   7.  Multicast
   8.  Lessons Learned
     8.1.  Differences Between ICP and HTTP
     8.2.  Parents, Siblings, Hits and Misses
     8.3.  Different Roles of ICP
     8.4.  Protocol Design Flaws of ICPv2
   9.  Security Considerations
     9.1.  Inserting Bogus ICP Queries
     9.2.  Inserting Bogus ICP Replies
     9.3.  Eavesdropping
     9.4.  Blocking ICP Messages
     9.5.  Delaying ICP Messages
     9.6.  Denial of Service
     9.7.  Altering ICP Fields
     9.8.  Summary
   10. References
   11. Acknowledgments
   12. Author's Addresses

1.  Introduction

   ICP is a lightweight message format used for communicating among Web
   caches.  ICP is used to exchange hints about the existence of URLs
   in neighbor caches.  Caches exchange ICP queries and replies to
   gather information for use in selecting the most appropriate
   location from which to retrieve an object.

   This document describes the implementation of ICP in software.  For
   a description of the protocol and message format, please refer to
   the companion document (RFCXXXX).  We avoid making judgments about
   whether or how ICP should be used in particular Web caching
   configurations.  ICP may be a "net win" in some situations, and a
   "net loss" in others.  We recognize that certain practices described
   in this document are suboptimal.  Some of these exist for historical
   reasons.  Some aspects have been improved in later versions.  Since
   this document only serves to describe current practices, we focus on
   documenting rather than evaluating.  However, we do address known
   security problems and other shortcomings.

   The remainder of this document is organized as follows.  We first
   describe Web cache hierarchies, explain the motivation for using
   ICP, and demonstrate how to configure its use in cache hierarchies.
   We then provide a step-by-step description of an ICP query-response
   transaction.  We then discuss ICP interaction with firewalls, and
   briefly touch on multicasting ICP.  We end with lessons we have
   learned during protocol development and deployment thus far, and the
   canonical security considerations.

   ICP was initially developed by Peter Danzig, et al., at the
   University of Southern California as a central part of hierarchical
   caching in the Harvest research project [3].

2.  Web Cache Hierarchies

   A single Web cache will reduce the amount of traffic generated by
   the clients behind it.
   Similarly, a group of Web caches can benefit by sharing another
   cache in much the same way.  Researchers on the Harvest project
   envisioned that it would be important to connect Web caches
   hierarchically.  In a cache hierarchy (or mesh) one cache
   establishes peering relationships with its neighbor caches.  There
   are two types of relationship: parent and sibling.  A parent cache
   is essentially one level up in a cache hierarchy.  A sibling cache
   is on the same level.  The terms "neighbor" and "peer" are used to
   refer to either parents or siblings which are a single "cache-hop"
   away.  Figure 1 shows a simple hierarchy configuration.

   But what does it mean to be "on the same level" or "one level up"?
   The general flow of document requests is up the hierarchy.  When a
   cache does not hold a requested object, it may ask via ICP whether
   any of its neighbor caches has the object.  If any neighbor has the
   requested object (a "neighbor hit"), the cache requests it from that
   neighbor.  If none of the neighbors has the object (a "neighbor
   miss"), the cache must forward the request either to a parent or
   directly to the origin server.  The essential difference between a
   parent and a sibling is that a "neighbor hit" may be fetched from
   either one, but a "neighbor miss" may NOT be fetched from a sibling.
   In other words, in a sibling relationship, a cache can only ask to
   retrieve objects that the sibling already has cached, whereas the
   same cache can ask a parent to retrieve any object regardless of
   whether or not it is cached.  A parent cache's role is to provide
   "transit" for the request if necessary, and accordingly parent
   caches are ideally located within or on the way to a transit
   Internet service provider (ISP).

               T H E   I N T E R N E T
          ===========================
              |             ||
              |             ||
              |             ||
              |             ||
              |      +----------------------+
              |      |                      |
              |      |        PARENT        |
              |      |        CACHE         |
              |      |                      |
              |      +----------------------+
              |             ||
            DIRECT          ||
          RETRIEVALS        ||
              |             ||
              |            HITS
              |            AND
              |           MISSES
              |          RESOLVED
              |             ||
              |             ||
              |             ||
              V             \/
      +------------------+                    +------------------+
      |                  |                    |                  |
      |     LOCAL        |/--------HITS-------|     SIBLING      |
      |     CACHE        |\------RESOLVED-----|     CACHE        |
      |                  |                    |                  |
      +------------------+                    +------------------+
         |  |  |  |  |
         |  |  |  |  |
         |  |  |  |  |
         V  V  V  V  V
        ===================
           CACHE CLIENTS

   FIGURE 1: A Simple Web cache hierarchy.  The local cache can
   retrieve hits from sibling caches, hits and misses from parent
   caches, and some requests directly from origin servers.

   Squid and Harvest allow for complex hierarchical configurations.
   For example, one could specify that a given neighbor be used for
   only a certain class of requests, such as URLs from a specific DNS
   domain.  Additionally, it is possible to treat a neighbor as a
   sibling for some requests and as a parent for others.

   The cache hierarchy model described here includes a number of
   features to prevent top-level caches from becoming choke points.
   One is the ability to restrict parents by domain, as just described.
   Another optimization is that the cache forwards only cachable
   requests to its neighbors.  A large class of Web requests are
   inherently uncachable, including: requests requiring certain types
   of authentication, session-encrypted data, highly personalized
   responses, and certain types of database queries.  Lower level
   caches should handle these requests directly rather than burdening
   parent caches.

3.  What is the Added Value of ICP?
   Although it is possible to maintain cache hierarchies without using
   ICP, the lack of ICP or something similar precludes sibling
   meta-communicative relationships, i.e., mechanisms to query nearby
   caches about a given document.

   One concern over the use of ICP is the additional delay that an ICP
   query/reply exchange contributes to an HTTP transaction.  However,
   if the ICP query can locate the object in a nearby neighbor cache,
   then the ICP delay may be more than offset by the faster delivery of
   the data from the neighbor.  In order to minimize ICP delays, the
   caches (as well as the protocol itself) are designed to answer ICP
   requests quickly.  Indeed, the application does minimal processing
   of the ICP request; most ICP-related delay is due to transmission on
   the network.

   ICP also provides an indication of neighbor reachability.  If ICP
   replies from a neighbor fail to arrive, then either the network path
   is congested (or down), or the cache application is not running on
   the ICP-queried neighbor machine.  In either case, the cache should
   not use this neighbor at this time.  Additionally, because an idle
   cache can turn around the replies faster than a busy one, all other
   things being equal, ICP provides some form of load balancing.

4.  Example Configuration of ICP Hierarchy

   Configuring caches within a hierarchy requires establishing peering
   relationships, which currently involves manual configuration at both
   peering endpoints.  One cache must indicate that the other is a
   parent or sibling.  The other cache will most likely have to add the
   first cache to its access control lists.

   Below we show some sample configuration lines for a hypothetical
   situation.  We have two caches, one operated by an ISP, and another
   operated by a customer.  First we describe how the customer would
   configure his cache to peer with the ISP.  Second, we describe how
   the ISP would allow the customer access to its cache.

4.1.  Configuring the `proxy.customer.org' cache

   In Squid, to configure parents and siblings in a hierarchy, a
   `cache_host' directive is entered into the configuration file.  The
   format is:

       cache_host hostname type http-port icp-port [options]

   Where type is either `parent', `sibling', or `multicast'.  For our
   example, it would be:

       cache_host cache.isp.com parent 8080 3130

   This configuration will cause the customer cache to resolve most
   cache misses through the parent (`cgi-bin' and non-GET requests
   would be resolved directly).  Utilizing the parent may be
   undesirable for certain servers, such as servers also in the
   customer.org domain.  To always handle such local domains directly,
   the customer would add this to his configuration file:

       local_domain customer.org

   It may also be the case that the customer wants to use the ISP cache
   only for a specific subset of DNS domains.  The need to limit
   requests this way is actually more common for higher levels of cache
   hierarchies, but it is illustrated here nonetheless.  To limit the
   ISP cache to a subset of DNS domains, the customer would use:

       cache_host_domain cache.isp.com com net org

   Then, any requests which are NOT in the .com, .net, or .org domains
   would be handled directly.

4.2.  Configuring the `cache.isp.com' cache

   To configure the query-receiving side of the cache peer relationship
   one uses access lists, similar to those used in routing peers.  The
   access lists support a large degree of customization in the peering