RFC 2187                          ICP                     September 1997


5.3.8.  ICP_OP_ERR

   Silently ignored.

5.3.9.  When all peers MISS

   For ICP_OP_HIT and ICP_OP_SECHO the request is forwarded immediately.
   For ICP_OP_HIT_OBJ there is no need to forward the request.  For all
   other reply opcodes, we wait until the expected number of replies
   have been received.  When we have all of the expected replies, or
   when the query timeout occurs, it is time to forward the request.

   Since MISS replies were received from all peers, we must select
   either a parent cache or the origin server.

   o    If the peers are using the ICP_FLAG_SRC_RTT feature, we forward
        the request to the peer with the lowest RTT to the origin
        server.  If the local cache is also measuring RTTs to origin
        servers, and is closer than any of the parents, the request is
        forwarded directly to the origin server.

   o    If there is a FIRST_PARENT_MISS parent available, the request
        will be forwarded there.

   o    If the ICP query/reply exchange did not produce any appropriate
        parents, the request will be sent directly to the origin server
        (unless firewall restrictions prevent it).
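
   The selection order above can be sketched as follows.  This is a
   minimal illustration, not Squid's implementation: the MissReply
   record, the select_next_hop() name, the "DIRECT" marker, and the
   direct_allowed switch (standing in for firewall restrictions) are
   all assumptions.  Replies are assumed to be listed in arrival order,
   so the first parent in the list plays the FIRST_PARENT_MISS role.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class MissReply:
    peer: str                          # hypothetical peer name
    is_parent: bool                    # from local configuration, not from ICP
    src_rtt_ms: Optional[int] = None   # peer's RTT to the origin, if reported


def select_next_hop(replies: Sequence[MissReply],
                    local_rtt_ms: Optional[int] = None,
                    direct_allowed: bool = True) -> Optional[str]:
    """Pick where to forward a request after every peer replied MISS.

    Returns a peer name, "DIRECT" for the origin server, or None when
    nothing is reachable.
    """
    parents = [r for r in replies if r.is_parent]

    # 1. Prefer the parent with the lowest reported RTT to the origin,
    #    unless the local cache measured a lower RTT itself.
    measured = [r for r in parents if r.src_rtt_ms is not None]
    if measured:
        best = min(measured, key=lambda r: r.src_rtt_ms)
        if (direct_allowed and local_rtt_ms is not None
                and local_rtt_ms < best.src_rtt_ms):
            return "DIRECT"
        return best.peer

    # 2. Otherwise use the first parent that replied with a MISS.
    if parents:
        return parents[0].peer

    # 3. No appropriate parent: go to the origin server if permitted.
    return "DIRECT" if direct_allowed else None
```

   Sibling replies are deliberately skipped, since a MISS from a
   sibling does not make it a candidate for forwarding.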

5.4.  ICP Options

   The following options were added to Squid to support some new
   features while maintaining backward compatibility with the Harvest
   implementation.

5.4.1.  ICP_FLAG_HIT_OBJ

   This flag is off by default and will be set in an ICP_OP_QUERY
   message only if these three criteria are met:

   o    It is enabled in the cache configuration file with `udp_hit_obj
        on'.

   o    The peer must be using ICP version 2.

   o    The HTTP request must not include the "Pragma: no-cache" header.






Wessels & Claffy             Informational                     [Page 13]

5.4.2.  ICP_FLAG_SRC_RTT

   This flag is off by default and will be set in an ICP_OP_QUERY
   message only if these two criteria are met:

   o    It is enabled in the cache configuration file with `query_icmp
        on'.

   o    The peer must be using ICP version 2.
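
   The criteria in sections 5.4.1 and 5.4.2 might be combined as in the
   sketch below.  The flag values are those assigned in RFC 2186; the
   function name and its parameters are hypothetical, with udp_hit_obj
   and query_icmp mirroring the configuration directives named above.

```python
ICP_FLAG_HIT_OBJ = 0x80000000   # value assigned in RFC 2186
ICP_FLAG_SRC_RTT = 0x40000000   # value assigned in RFC 2186


def query_option_flags(udp_hit_obj, query_icmp, peer_icp_version,
                       request_headers):
    """Return the Options word for an ICP_OP_QUERY sent to one peer."""
    # Both flags require that the peer speaks ICP version 2.
    if peer_icp_version != 2:
        return 0

    # HTTP header names are case-insensitive, so normalize before lookup.
    headers = {k.lower(): v.lower() for k, v in request_headers.items()}
    no_cache = "no-cache" in headers.get("pragma", "")

    flags = 0
    if udp_hit_obj and not no_cache:
        flags |= ICP_FLAG_HIT_OBJ
    if query_icmp:
        flags |= ICP_FLAG_SRC_RTT
    return flags
```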


6.  Firewalls

   Operating a Web cache behind a firewall or in a private network poses
   some interesting problems.  The hard part is figuring out whether the
   cache is able to connect to the origin server.  Harvest and Squid
   provide an `inside_firewall' configuration directive to list DNS
   domains on the near side of a firewall.  Everything else is assumed
   to be on the far side of a firewall.  Squid also has a `firewall_ip'
   directive so that inside hosts can be specified by IP addresses as
   well.

   In a simple configuration, a Squid cache behind a firewall will have
   only one parent cache (which is on the firewall itself).  In this
   case, Squid must use that parent for all servers beyond the firewall,
   so there is no need to utilize ICP.

   In a more complex configuration, there may be a number of peer caches
   also behind the firewall.  Here, ICP may be used to check for cache
   hits in the peers.  Occasionally, when ICP is being used, there may
   not be any replies received.  If the cache were not behind a
   firewall, the request would be forwarded directly to the origin
   server.  But in this situation, the cache must pick a parent cache,
   either randomly or due to configuration information.  For example,
   Squid allows a parent cache to be designated as a default choice when
   no others are available.
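
   A rough sketch of this decision follows.  It assumes that
   `inside_firewall' domains are matched as DNS suffixes and that a
   configured default parent takes precedence over a random pick; the
   function names and the "DIRECT" marker are hypothetical and are not
   taken from Squid.

```python
import random


def is_inside_firewall(host, inside_domains):
    """True if host falls under one of the listed near-side DNS domains."""
    host = host.lower().rstrip(".")
    for domain in inside_domains:
        domain = domain.lower().strip(".")
        if host == domain or host.endswith("." + domain):
            return True
    return False


def choose_when_no_replies(host, inside_domains, parents,
                           default_parent=None):
    """Pick a next hop when no ICP replies arrived before the timeout."""
    if is_inside_firewall(host, inside_domains):
        return "DIRECT"                  # origin is reachable directly
    if default_parent is not None:
        return default_parent            # configured default choice
    if parents:
        return random.choice(parents)    # otherwise pick one at random
    return None                          # nothing can reach the origin
```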

7.  Multicast

   For efficient distribution, a cache may deliver ICP queries to a
   multicast address, and neighbor caches may join the multicast group
   to receive such queries.

   Current practice is that caches send ICP replies only to unicast
   addresses, for several reasons:

   o    Multicasting ICP replies would not reduce the number of packets
        sent.



   o    It prevents other group members from receiving unexpected
        replies.

   o    The reply should follow unicast routing paths to indicate
        (unicast) connectivity between the receiver and the sender since
        the subsequent HTTP request will be unicast routed.

   Trust is an important aspect of inter-cache relationships.  A Web
   cache should not automatically trust any cache which replies to a
   multicast ICP query.  Caches should ignore ICP messages from
   addresses not specifically configured as neighbors.  Otherwise, one
   could easily pollute a cache mesh by running an illegitimate cache
   and having it join a group, return ICP_OP_HIT for all requests, and
   then deliver bogus content.

   When sending to multicast groups, cache administrators must be
   careful to use the minimum multicast TTL required to reach all group
   members.  Joining a multicast group requires no special privileges
   and there is no way to prevent anyone from joining "your" group.  Two
   groups of caches utilizing the same multicast address could overlap,
   which would cause a cache to receive ICP replies from unknown
   neighbors.  The unknown neighbors would not be used to retrieve the
   object data, but the cache would constantly receive ICP replies that
   it must always ignore.

   To prevent an overlapping cache mesh, caches should thus limit the
   scope of their ICP queries with appropriate TTLs; an application such
   as mtrace[6] can determine appropriate multicast TTLs.
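
   On a sockets API, the multicast TTL is typically a per-socket option.
   The sketch below shows one way to set it in Python; the function
   name is hypothetical, and the appropriate TTL value is site-specific
   and must be determined by the administrator.

```python
import socket


def open_icp_multicast_socket(ttl):
    """Create a UDP socket whose outgoing multicast packets use this TTL."""
    if not 0 <= ttl <= 255:
        raise ValueError("multicast TTL must be in the range 0..255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Limit how many router hops multicast ICP queries may traverse.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

   Queries sent on the returned socket with sendto() to the group
   address then carry the configured TTL.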

   As mentioned in section 5.1.3, we need to estimate the number of
   expected replies for an ICP_OP_QUERY message.  For unicast we expect
   one reply for each query if the peer is up.  However, for multicast
   we generally expect more than one reply, but have no way of knowing
   exactly how many replies to expect.  Squid regularly (every 15
   minutes) sends out test ICP_OP_QUERY messages to only the multicast
   group peers.  As with a real ICP query, a timeout event is installed
   and the replies are counted until the timeout occurs.  We have found
   that the received count varies considerably.  Therefore, the number
   of replies to expect is calculated as a moving average, rounded down
   to the nearest integer.
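
   The estimate could be maintained as in the sketch below.  The text
   does not say which kind of moving average is used, so a simple
   average over the last few probes is assumed here; the class name and
   the window size are likewise illustrative.

```python
from collections import deque


class ExpectedReplyEstimator:
    """Track reply counts from periodic multicast test queries."""

    def __init__(self, window=10):
        # Only the most recent `window` probe results are kept.
        self._counts = deque(maxlen=window)

    def record_probe(self, replies_received):
        """Store the number of replies counted before the probe timed out."""
        self._counts.append(replies_received)

    def expected_replies(self):
        """Moving average of recent counts, rounded down to an integer."""
        if not self._counts:
            return 0
        return sum(self._counts) // len(self._counts)
```

   Rounding down means a query is forwarded slightly early rather than
   waiting for a reply that may never arrive.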











8.  Lessons Learned

8.1.  Differences Between ICP and HTTP

   ICP is notably different from HTTP.  HTTP supports a rich and
   sophisticated set of features.  In contrast, ICP was designed to be
   simple, small, and efficient.  HTTP request and reply headers consist
   of lines of ASCII text delimited by a CRLF pair, whereas ICP uses a
   fixed size header and represents numbers in binary.  The only thing
   ICP and HTTP have in common is the URL.

   Note that the ICP message does not even include the HTTP request
   method.  The original implementation assumed that only GET requests
   would be cacheable and there would be no need to locate non-GET
   requests in neighbor caches.  Thus, the current version of ICP does
   not accommodate non-GET requests, although the next version of this
   protocol will likely include a field for the request method.

   HTTP defines features that are important for caching but not
   expressible with the current ICP protocol.  Among these are Pragma:
   no-cache, If-Modified-Since, and all of the Cache-Control features of
   HTTP/1.1.  An ICP_OP_HIT_OBJ message may deliver an object which may
   not obey all of the request header constraints.  These differences
   between ICP and HTTP are the reason we discourage the use of the
   ICP_OP_HIT_OBJ feature.

8.2.  Parents, Siblings, Hits and Misses

   Note that the ICP message does not have a field to indicate the
   intent of the querying cache.  That is, nowhere in the ICP request or
   reply does it say that the two caches have a sibling or parent
   relationship.  A sibling cache can only respond with HIT or MISS, not
   "you can retrieve this from me" or "you cannot retrieve this from
   me."  The querying cache must apply the HIT or MISS reply to its
   local configuration to prevent it from resolving misses through a
   sibling cache.  This constraint is awkward, because this aspect of
   the relationship can be configured only in the cache originating the
   requests, and indirectly via the access controls configured in the
   queried cache as described earlier in section 4.2.
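
   In other words, the querying cache combines each reply with its own
   view of the relationship.  A minimal sketch, using hypothetical
   string labels for the peer type and the reply:

```python
def may_forward_to(peer_type, reply):
    """Decide whether a request may be sent to a peer after its ICP reply.

    peer_type comes from the querying cache's local configuration
    ("parent" or "sibling"); reply is "HIT" or "MISS".
    """
    if reply == "HIT":
        return True                    # a hit may be fetched from any peer
    if reply == "MISS":
        return peer_type == "parent"   # misses are resolved only via parents
    return False                       # anything else is not usable here
```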












8.3.  Different Roles of ICP

   There are two different understandings of what exactly the role of
   ICP is in a cache mesh.  One understanding is that ICP's role is only
   object location, specifically, to provide hints about whether or not
   a named object exists in a neighbor cache.  An implied assumption is
   that cache hits are highly desirable, and ICP is used to maximize the
   chance of getting them.  If an ICP message is lost due to congestion,
   then nothing significant is lost; the request will be satisfied
   regardless.

   ICP is increasingly being tasked to fill a more complex role:
   conveying cache usage policy.  For example, many organizations (e.g.
   universities) will install a Web cache on the border of their
   network.  Such organizations may be happy to establish sibling
   relationships with other, nearby caches, subject to the following
   terms:

   o    Any of the organization's customers or users may request any
        object (cached or not).

   o    Anyone may request an object already in the cache.

   o    Anyone may request any object from the organization's servers
        behind the cache.

   o    All other requests are denied; specifically, the organization
        will not provide transit for requests in which neither the
        client nor the server falls within its domain.

   To successfully convey policy, the ICP exchange must very accurately
   predict the result (hit, miss) of a subsequent HTTP request.  The
   result may often depend on other request fields, such as Cache-
   Control.  So it's not possible for ICP to accurately predict the
   result without more, or perhaps all, of the HTTP request.
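
   The mismatch can be illustrated with a deliberately simplified
   model.  All names below are hypothetical, and treating any request
   that carries "Cache-Control: no-cache" as a miss is an assumption
   made only to keep the sketch short.

```python
def icp_predicts_hit(url, cached_urls):
    """ICP sees only the URL, so its answer depends on nothing else."""
    return url in cached_urls


def http_is_hit(url, cached_urls, request_headers):
    """The HTTP result also depends on request headers."""
    cache_control = request_headers.get("Cache-Control", "").lower()
    forced_miss = "no-cache" in cache_control
    return url in cached_urls and not forced_miss


def request_allowed(client_is_local, origin_is_local, is_hit):
    """The sibling policy terms listed above, as one predicate."""
    return client_is_local or origin_is_local or is_hit
```

   For an outside client requesting an outside object, the ICP reply
   may be HIT while the HTTP request becomes a miss and is denied.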

8.4.  Protocol Design Flaws of ICPv2

   We recognize certain flaws with the original design of ICP, and make
   note of them so that future versions can avoid the same mistakes.

   o    The NULL-terminated URL in the payload field requires stepping
        through the message an octet at a time to find some of the
        fields (i.e. the beginning of object data in an ICP_OP_HIT_OBJ
        message).






   o    Two fields (Sender Host Address and Requester Host Address) are
        IPv4-specific.  However, neither of these fields is used in
        practice; they are normally zero-filled.  If IP addresses have a
        role in the ICP message, there needs to be an address family
        descriptor for each address, and clients need to be able to say
        whether they want to hear IPv6 responses or not.

   o    Options are limited to 32 option flags and 32 bits of option
        data.  This should be more like TCP, with an option descriptor
        followed by option data.

   o    Although currently used as the cache key, the URL string no
        longer serves this role adequately.  Some HTTP responses now
        vary according to the requester's User-Agent and other headers.
        A cache key must incorporate all non-transport headers present
        in the client's request.  All non-hop-by-hop request headers
        should be sent in an ICP query.

   o    ICPv2 uses different opcode values for queries and responses.
        ICP should use the same opcode for both sides of a two-sided
        transaction, with a "query/response" indicator telling which
        side is which.

   o    ICPv2 does not include any authentication fields.
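
   The first flaw is visible in any parser.  The sketch below assumes
   the message layout given in RFC 2186: a 20-octet fixed header, and
   for ICP_OP_HIT_OBJ a NULL-terminated URL followed by a 16-bit object
   size and the object data.  The function name is hypothetical.

```python
import struct

# opcode, version, message length, request number, options,
# option data, sender host address (20 octets in total)
ICP_HEADER = struct.Struct("!BBHIII4s")
ICP_OP_HIT_OBJ = 23


def parse_hit_obj(message):
    """Return (url, object_data) from an ICP_OP_HIT_OBJ message."""
    opcode, _version, length, _reqnum, _options, _optdata, _sender = \
        ICP_HEADER.unpack_from(message, 0)
    if opcode != ICP_OP_HIT_OBJ:
        raise ValueError("not an ICP_OP_HIT_OBJ message")

    payload = message[ICP_HEADER.size:length]

    # The URL has no length prefix, so the object size field can only
    # be located by scanning for the terminating NULL octet.
    nul = payload.find(b"\x00")
    if nul < 0:
        raise ValueError("URL is not NULL-terminated")
    url = payload[:nul].decode("ascii")

    # The 16-bit size follows the NULL directly and may be unaligned.
    (object_size,) = struct.unpack_from("!H", payload, nul + 1)
    data = payload[nul + 3:nul + 3 + object_size]
    if len(data) != object_size:
        raise ValueError("truncated object data")
    return url, data
```

   A length-prefixed URL would let the parser jump straight to the
   object size instead of scanning.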

9.  Security Considerations

   Security is an issue with ICP over UDP because of its connectionless
   nature.  Below we consider various vulnerabilities and methods of
   attack, and their implications.

   Our first line of defense is to check the source IP address of the
   ICP message, e.g. as given by recvfrom(2).  ICP query messages should
   be processed if the access control rules allow the querying address
   access to the cache.  However, ICP reply messages must only be
   accepted from known neighbors; a cache must ignore replies from
   unknown addresses.
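
   This first check might look like the sketch below.  It assumes the
   source address has already been obtained (for example from
   recvfrom), that the query access rules are a list of permitted
   networks, and that neighbors are a set of configured addresses; the
   function name and these data structures are illustrative only.

```python
import ipaddress

ICP_OP_QUERY = 1   # opcode value assigned in RFC 2186


def accept_icp_message(opcode, source_ip, neighbors, query_acl):
    """Decide whether to process an ICP message based on its source.

    neighbors: set of ipaddress address objects for configured peers.
    query_acl: iterable of ipaddress network objects allowed to query.
    """
    addr = ipaddress.ip_address(source_ip)
    if opcode == ICP_OP_QUERY:
        # Queries are governed by the cache's access control rules.
        return any(addr in net for net in query_acl)
    # Every other opcode is a reply: accept only from known neighbors.
    return addr in neighbors
```

   As the next paragraph notes, this check is only as trustworthy as
   the source address itself.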

   Because we trust the validity of an address in an IP packet, ICP is
   susceptible to IP address spoofing.  In this document we address some
   consequences of IP address spoofing.  Normally, spoofed addresses can
   only be detected by routers, not by hosts.  However, the IP
   Authentication Header[7,8] can be used underneath ICP to provide
   cryptographic authentication of the entire IP packet containing the
   ICP protocol, thus eliminating the risk of IP address spoofing.





