Network Working Group                                         D. Wessels
Request for Comments: 2187                                     K. Claffy
Category: Informational                  National Laboratory for Applied
                                                   Network Research/UCSD
                                                          September 1997
Application of Internet Cache Protocol (ICP), version 2
Status of this Memo
This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.
Abstract
This document describes the application of ICPv2 (Internet Cache
Protocol version 2, RFC2186) to Web caching. ICPv2 is a lightweight
message format used for communication among Web caches. Several
independent caching implementations now use ICP[3,5], making it
important to codify the existing practical uses of ICP for those
trying to implement, deploy, and extend its use.
ICP queries and replies refer to the existence of URLs (or objects)
in neighbor caches. Caches exchange ICP messages and use the
gathered information to select the most appropriate location from
which to retrieve an object. A companion document (RFC2186)
describes the format and syntax of the protocol itself. In this
document we focus on issues of ICP deployment, efficiency, security,
and interaction with other aspects of Web traffic behavior.
Table of Contents
1. Introduction................................................. 2
2. Web Cache Hierarchies........................................ 3
3. What is the Added Value of ICP?.............................. 5
4. Example Configuration of ICP Hierarchy....................... 5
4.1. Configuring the `proxy.customer.org' cache................. 6
4.2. Configuring the `cache.isp.com' cache...................... 6
5. Applying the Protocol........................................ 7
5.1. Sending ICP Queries........................................ 8
5.2. Receiving ICP Queries and Sending Replies.................. 10
5.3. Receiving ICP Replies...................................... 11
5.4. ICP Options................................................ 13
6. Firewalls.................................................... 14
7. Multicast.................................................... 14
8. Lessons Learned.............................................. 16
8.1. Differences Between ICP and HTTP........................... 16
Wessels & Claffy Informational [Page 1]
RFC 2187 ICP September 1997
8.2. Parents, Siblings, Hits and Misses......................... 16
8.3. Different Roles of ICP..................................... 17
8.4. Protocol Design Flaws of ICPv2............................. 17
9. Security Considerations...................................... 18
9.1. Inserting Bogus ICP Queries................................ 19
9.2. Inserting Bogus ICP Replies................................ 19
9.3. Eavesdropping.............................................. 20
9.4. Blocking ICP Messages...................................... 20
9.5. Delaying ICP Messages...................................... 20
9.6. Denial of Service.......................................... 20
9.7. Altering ICP Fields........................................ 21
9.8. Summary.................................................... 22
10. References................................................... 23
11. Acknowledgments.............................................. 24
12. Authors' Addresses........................................... 24
1. Introduction
ICP is a lightweight message format used for communicating among Web
caches. ICP is used to exchange hints about the existence of URLs in
neighbor caches. Caches exchange ICP queries and replies to gather
information for use in selecting the most appropriate location from
which to retrieve an object.
This document describes the implementation of ICP in software. For a
description of the protocol and message format, please refer to the
companion document (RFC2186). We avoid making judgments about
whether or how ICP should be used in particular Web caching
configurations. ICP may be a "net win" in some situations, and a
"net loss" in others. We recognize that certain practices described
in this document are suboptimal. Some of these exist for historical
reasons. Some aspects have been improved in later versions. Since
this document only serves to describe current practices, we focus on
documenting rather than evaluating. However, we do address known
security problems and other shortcomings.
The remainder of this document is organized as follows. We first
describe Web cache hierarchies, explain the motivation for using ICP,
and demonstrate how to configure its use in cache hierarchies. We
then provide a step-by-step description of an ICP query-response
transaction. We then discuss ICP interaction with firewalls, and
briefly touch on multicasting ICP. We end with lessons we have
learned during the protocol development and deployment thus far, and
the canonical security considerations.
ICP was initially developed by Peter Danzig et al. at the
University of Southern California as a central part of hierarchical
caching in the Harvest research project[3].
2. Web Cache Hierarchies
A single Web cache will reduce the amount of traffic generated by the
clients behind it. Similarly, a group of Web caches can benefit by
sharing another cache in much the same way. Researchers on the
Harvest project envisioned that it would be important to connect Web
caches hierarchically. In a cache hierarchy (or mesh) one cache
establishes peering relationships with its neighbor caches. There
are two types of relationship: parent and sibling. A parent cache is
essentially one level up in a cache hierarchy. A sibling cache is on
the same level. The terms "neighbor" and "peer" are used to refer to
either parents or siblings which are a single "cache-hop" away.
Figure 1 shows a simple hierarchy configuration.
But what does it mean to be "on the same level" or "one level up?"
The general flow of document requests is up the hierarchy. When a
cache does not hold a requested object, it may ask via ICP whether
any of its neighbor caches has the object. If any of the neighbors
does have the requested object (i.e., a "neighbor hit"), then the
cache will request it from them. If none of the neighbors has the
object (a "neighbor miss"), then the cache must forward the request
either to a parent, or directly to the origin server. The essential
difference between a parent and sibling is that a "neighbor hit" may
be fetched from either one, but a "neighbor miss" may NOT be fetched
from a sibling. In other words, in a sibling relationship, a cache
can only ask to retrieve objects that the sibling already has cached,
whereas the same cache can ask a parent to retrieve any object
regardless of whether or not it is cached. A parent cache's role is
to provide "transit" for the request if necessary, and accordingly
parent caches are ideally located within or on the way to a transit
Internet service provider (ISP).
      T H E   I N T E R N E T
   ===========================
         |        ||
         |        ||
         |        ||
         |        ||
         |    +----------------------+
         |    |                      |
         |    |        PARENT        |
         |    |        CACHE         |
         |    |                      |
         |    +----------------------+
         |        ||
      DIRECT      ||
    RETRIEVALS    ||
         |        ||
         |       HITS
         |       AND
         |      MISSES
         |     RESOLVED
         |        ||
         |        ||
         |        ||
         V        \/
    +------------------+                    +------------------+
    |                  |                    |                  |
    |      LOCAL       |/--------HITS-------|     SIBLING      |
    |      CACHE       |\------RESOLVED-----|      CACHE       |
    |                  |                    |                  |
    +------------------+                    +------------------+
       |  |  |  |  |
       |  |  |  |  |
       |  |  |  |  |
       V  V  V  V  V
    ===================
       CACHE CLIENTS
FIGURE 1: A simple Web cache hierarchy. The local cache can retrieve
hits from sibling caches, hits and misses from parent caches, and
some requests directly from origin servers.
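The parent/sibling rule can be illustrated with a minimal Python sketch. The names `PeerType` and `choose_source` are hypothetical and do not come from Squid or Harvest. The sketch also assumes that a cache with a parent prefers it over a direct fetch on a neighbor miss, whereas the text permits either choice.

```python
from enum import Enum


class PeerType(Enum):
    PARENT = "parent"
    SIBLING = "sibling"


def choose_source(neighbors, hits):
    """Pick where to fetch an object from.

    neighbors: dict mapping neighbor name -> PeerType, in preference order.
    hits: set of neighbor names that answered the ICP query with a hit.
    Returns a neighbor name, or "DIRECT" for the origin server.
    """
    # A neighbor hit may be fetched from either a parent or a sibling.
    for name in neighbors:
        if name in hits:
            return name
    # A neighbor miss may be forwarded to a parent, never to a sibling.
    # Assumption: prefer a parent over going direct when one exists.
    for name, kind in neighbors.items():
        if kind is PeerType.PARENT:
            return name
    return "DIRECT"
```

With one sibling and one parent configured, a hit on the sibling is fetched from the sibling, a miss everywhere goes to the parent, and a cache that has only siblings falls back to the origin server.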
Squid and Harvest allow for complex hierarchical configurations. For
example, one could specify that a given neighbor be used for only a
certain class of requests, such as URLs from a specific DNS domain.
Additionally, it is possible to treat a neighbor as a sibling for
some requests and as a parent for others.
The cache hierarchy model described here includes a number of
features to prevent top-level caches from becoming choke points. One
is the ability to restrict parents as described above (by
domains). Another optimization is that the cache only forwards
cachable requests to its neighbors. A large class of Web requests
are inherently uncachable, including: requests requiring certain
types of authentication, session-encrypted data, highly personalized
responses, and certain types of database queries. Lower level caches
should handle these requests directly rather than burdening parent
caches.
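As a rough illustration of this filtering, the Python sketch below decides whether a request should be offered to neighbor caches at all. The function name and the specific heuristics (non-GET methods, an Authorization header, `https` URLs, `cgi-bin` paths, query strings) are assumptions chosen to mirror the classes of uncachable requests listed above. They are not the rules of any particular cache implementation.

```python
from urllib.parse import urlsplit


def may_forward_to_neighbors(method, url, has_authorization=False):
    """Return True if the request looks cachable enough to send to neighbors."""
    # Only GET requests are treated as cachable in this sketch.
    if method.upper() != "GET":
        return False
    # Requests carrying credentials are handled directly.
    if has_authorization:
        return False
    parts = urlsplit(url)
    # Session-encrypted traffic cannot usefully be shared between caches.
    if parts.scheme.lower() == "https":
        return False
    # Heuristic for dynamic content such as database queries.
    if "cgi-bin" in parts.path or parts.query:
        return False
    return True
```

A lower-level cache would call such a check before sending any ICP query, and resolve the request directly when it returns False.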
3. What is the Added Value of ICP?
Although it is possible to maintain cache hierarchies without using
ICP, the lack of ICP or something similar rules out sibling
relationships, which depend on a mechanism to query nearby caches
about a given document.
One concern over the use of ICP is the additional delay that an ICP
query/reply exchange contributes to an HTTP transaction. However, if
the ICP query can locate the object in a nearby neighbor cache, then
the ICP delay may be more than offset by the faster delivery of the
data from the neighbor. In order to minimize ICP delays, the caches
(as well as the protocol itself) are designed to answer ICP requests
quickly. Indeed, the application does minimal processing of the ICP
request; most ICP-related delay is due to transmission on the
network.
ICP also serves to provide an indication of neighbor reachability.
If ICP replies from a neighbor fail to arrive, then either the
network path is congested (or down), or the cache application is not
running on the ICP-queried neighbor machine. In either case, the
cache should not use this neighbor at this time. Additionally,
because an idle cache can turn around the replies faster than a busy
one, all other things being equal, ICP provides some form of load
balancing.
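One way to act on missing replies is to track them per neighbor, as in the Python sketch below. The class name `NeighborHealth` and the threshold of three consecutive unanswered queries are assumptions made for illustration. The text says only that a neighbor whose replies fail to arrive should not be used at this time.

```python
class NeighborHealth:
    """Tracks whether a neighbor has been answering ICP queries."""

    def __init__(self, dead_after=3):
        # Assumed policy: stop using the neighbor after this many
        # consecutive queries that timed out without a reply.
        self.dead_after = dead_after
        self.consecutive_timeouts = 0

    def query_timed_out(self):
        """Record an ICP query that received no reply in time."""
        self.consecutive_timeouts += 1

    def reply_received(self):
        """Record an ICP reply; any reply shows the neighbor is reachable."""
        self.consecutive_timeouts = 0

    @property
    def usable(self):
        return self.consecutive_timeouts < self.dead_after
```

A cache would consult `usable` before forwarding a request to the neighbor. Because any reply resets the counter, a neighbor returns to service as soon as it answers again.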
4. Example Configuration of ICP Hierarchy
Configuring caches within a hierarchy requires establishing peering
relationships, which currently involves manual configuration at both
peering endpoints. One cache must indicate that the other is a
parent or sibling. The other cache will most likely have to add the
first cache to its access control lists.
Below we show some sample configuration lines for a hypothetical
situation. We have two caches, one operated by an ISP, and another
operated by a customer. First we describe how the customer would
configure his cache to peer with the ISP. Second, we describe how
the ISP would allow the customer access to its cache.
4.1. Configuring the `proxy.customer.org' cache
In Squid, to configure parents and siblings in a hierarchy, a
`cache_host' directive is entered into the configuration file. The
format is:
cache_host hostname type http-port icp-port [options]
Where type is either `parent', `sibling', or `multicast'. For our
example, it would be:
cache_host cache.isp.com parent 8080 3130
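To make the field layout of the directive concrete, the following Python sketch parses a `cache_host` line into its parts. The `CacheHost` class and `parse_cache_host` function are hypothetical helpers written for this illustration. They are not part of Squid, whose own configuration parser may behave differently.

```python
from dataclasses import dataclass, field
from typing import List

VALID_TYPES = {"parent", "sibling", "multicast"}


@dataclass
class CacheHost:
    hostname: str
    peer_type: str
    http_port: int
    icp_port: int
    options: List[str] = field(default_factory=list)


def parse_cache_host(line: str) -> CacheHost:
    """Parse 'cache_host hostname type http-port icp-port [options]'."""
    tokens = line.split()
    if len(tokens) < 5 or tokens[0] != "cache_host":
        raise ValueError("not a cache_host directive: %r" % line)
    _, hostname, peer_type, http_port, icp_port, *options = tokens
    if peer_type not in VALID_TYPES:
        raise ValueError("unknown neighbor type: %r" % peer_type)
    # int() raises ValueError if a port is not numeric.
    return CacheHost(hostname, peer_type, int(http_port), int(icp_port), options)
```

Applied to the example line above, this yields the hostname `cache.isp.com`, the type `parent`, HTTP port 8080, and ICP port 3130, with no options.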
This configuration will cause the customer cache to resolve most
cache misses through the parent (`cgi-bin' and non-GET requests would
be resolved directly). Utilizing the parent may be undesirable for
certain servers, such as servers also in the customer.org domain. To
always handle such local domains directly, the customer would add
this to his configuration file:
local_domain customer.org
It may also be the case that the customer wants to use the ISP cache
only for a specific subset of DNS domains. The need to limit
requests this way is actually more common for higher levels of cache
hierarchies, but it is illustrated here nonetheless. To limit the
ISP cache to a subset of DNS domains, the customer would use:
cache_host_domain cache.isp.com com net org
Then, any requests which are NOT in the .com, .net, or .org domains
would be handled directly.
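The combined effect of `local_domain` and `cache_host_domain` can be sketched in Python as follows. The helper names are hypothetical, and the suffix-matching rule is an assumption about how domain names are compared; the exact matching semantics belong to the cache implementation.

```python
def domain_matches(host, domain):
    """True if host equals domain or is a subdomain of it (assumed rule)."""
    host = host.lower().rstrip(".")
    domain = domain.lower().strip(".")
    return host == domain or host.endswith("." + domain)


def route_for(host, local_domains, peer_domains, peer="cache.isp.com"):
    """Return the neighbor to use for host, or "DIRECT"."""
    # Hosts in a local domain are always fetched directly.
    if any(domain_matches(host, d) for d in local_domains):
        return "DIRECT"
    # If the neighbor is limited to certain domains, all others go direct.
    if peer_domains and not any(domain_matches(host, d) for d in peer_domains):
        return "DIRECT"
    return peer
```

With `local_domains=["customer.org"]` and `peer_domains=["com", "net", "org"]`, a request for `www.customer.org` goes direct, `www.example.com` goes to the ISP cache, and `www.example.edu` goes direct.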
4.2. Configuring the `cache.isp.com' cache
To configure the query-receiving side of the cache peer
relationship one uses access lists, similar to those used in routing
peers. The access lists support a large degree of customization in
the peering relationship. If there are no access lines present, the
cache allows the request by default.