Wessels & Claffy                                                [Page 6]
Internet-Draft                                                8 Jul 1997

   relationship.  If there are no access lines present, the cache allows
   the request by default.

   Note that the cache.isp.com cache need not explicitly specify the
   customer cache as a peer, nor is the type of relationship encoded
   within the ICP query itself.  The access control entries regulate the
   relationships between this cache and its neighbors.  For our example,
   the ISP would use:

       acl Customer src proxy.customer.org
       http_access allow Customer
       icp_access  allow Customer

   This defines an access control entry named `Customer' which specifies
   a source IP address of the customer cache machine.  The customer
   cache would then be allowed to make any request to both the HTTP and
   ICP ports (including cache misses).  This configuration implies that
   the ISP cache is a parent of the customer.

   If the ISP wanted to enforce a sibling relationship, it would need to
   deny access to cache misses.  This would be done as follows:

       miss_access deny Customer

   Of course the ISP should also communicate this to the customer, so
   that the customer will change his configuration from parent to
   sibling.  Otherwise, if the customer requests an object not in the
   ISP cache, an error message is generated.

5.  Applying the Protocol

   The following sections describe the ICP implementation in the
   Harvest[3] (research version) and Squid Web cache[5] packages.  In
   terms of version numbers, this means version 1.4pl2 for Harvest and
   version 1.1.10 for Squid.

   The basic sequence of events in an ICP transaction is as follows:

   1.   Local cache receives an HTTP[1] request from a cache client.

   2.   The local cache sends ICP queries (section 5.1).

   3.   The peer cache(s) receive the queries and send ICP replies
        (section 5.2).

   4.   The local cache receives the ICP replies and decides where to
        forward the request (section 5.3).

5.1.  Sending ICP Queries

5.1.1.  Determine whether to use ICP at all

   Not every HTTP request requires an ICP query to be sent.  Obviously,
   cache hits will not need ICP because the request is satisfied
   immediately.  For origin servers very close to the cache, we do not
   want to use any neighbor caches.  In Squid and Harvest, the
   administrator specifies what constitutes a `local' server with the
   `local_domain' and `local_ip' configuration options.  The cache
   always contacts a local server directly, never querying a peer
   cache.

   There are other classes of requests that the cache (or the
   administrator) may prefer to forward directly to the origin server.
   In Squid and Harvest, one such class includes all non-GET request
   methods.  A Squid cache can also be configured to not use peers for
   URLs matching the `hierarchy_stoplist'.

   In order for an HTTP request to yield an ICP transaction, it must:

   o    not be a cache hit

   o    not be to a local server

   o    be a GET request, and

   o    not match the `hierarchy_stoplist' configuration.

   We call this a "hierarchical" request.  A "non-hierarchical" request
   is one that doesn't generate any ICP traffic.  To avoid processing
   requests that are likely to lower cache efficiency, one can configure
   the cache to not consult the hierarchy for URLs that contain certain
   strings (e.g. `cgi_bin').

5.1.2.  Determine which peers to query

   By default, a cache sends an ICP_OP_QUERY message to each peer,
   unless any one of the following is true:

   o    Restrictions prevent querying a peer for this request, based on
        the configuration directive `cache_host_domain', which specifies
        a set of DNS domains (from the URLs) for which the peer should
        or should not be queried.  In Squid, a more flexible directive
        (`cache_host_acl') supports restrictions on other parts of the
        request (method, port number, source, etc.).

   o    The peer is a sibling, and the HTTP request includes a "Pragma:
        no-cache" header.  This is because the sibling would be asked to
        transit the request, which is not allowed.

   o    The peer is configured to never be sent ICP queries (i.e. with
        the `no-query' option).

   If the determination yields only one queryable ICP peer, and the
   Squid configuration directive `single_parent_bypass' is set, then one
   can bypass waiting for the single ICP response and just send the HTTP
   request directly to the peer cache.

   The Squid configuration option `source_ping' configures a Squid cache
   to send a ping to the original source simultaneous with its ICP
   queries, in case the origin is closer than any of the caches.

5.1.3.  Calculate the expected number of ICP replies

   Harvest and Squid want to maximize the chance to get a HIT reply from
   one of the peers.  Therefore, the cache waits for all ICP replies to
   be received.  Normally, we expect to receive an ICP reply for each
   query sent, except:

   o    When the peer is believed to be down.  If the peer is down,
        Squid and Harvest continue to send it ICP queries, but do not
        expect the peer to reply.  When an ICP reply is again received
        from the peer, its status will be changed to up.

        The determination of up/down status has varied a little bit as
        the Harvest and Squid software evolved.  Both Harvest and Squid
        mark a peer down when it fails to reply to 20 consecutive ICP
        queries.  Squid also marks a peer down when a TCP connection
        fails, and up again when a diagnostic TCP connection succeeds.

   o    When sending to a multicast address.  In this case we'll
        probably expect to receive more than one reply, and have no way
        to definitively determine how many to expect.  We discuss
        multicast issues in section 7 below.

5.1.4.  Install timeout event

   Because ICP uses UDP as its underlying transport, ICP queries and
   replies may sometimes be dropped by the network.  The cache installs
   a timeout event in case not all of the expected replies arrive.  By
   default Squid and Harvest use a two-second timeout.  If object
   retrieval has not commenced when the timeout occurs, a source is
   selected as described in section 5.3.9 below.

5.2.  Receiving ICP Queries and Sending Replies

   When an ICP_OP_QUERY message is received, the cache examines it and
   decides which reply message is to be sent.  It will send one of the
   following reply opcodes, tested for use in the order listed:

5.2.1.  ICP_OP_ERR

   The URL is extracted from the payload and parsed.  If parsing fails,
   an ICP_OP_ERR message is returned.

5.2.2.  ICP_OP_DENIED

   The access controls are checked.  If the peer is not allowed to make
   this request, ICP_OP_DENIED is returned.  Squid counts the number of
   ICP_OP_DENIED messages sent to each peer.  If more than 95% of more
   than 100 replies have been denied, then no reply is sent at all.
   This prevents misconfigured caches from endlessly sending unnecessary
   ICP messages back and forth.

5.2.3.  ICP_OP_HIT

   If the cache reaches this point without already matching one of the
   previous opcodes, it means the request is allowed and we must
   determine if it will be HIT or MISS, so we check if the URL exists in
   the local cache.  If so, and if the cached entry is fresh for at
   least the next 30 seconds, we can return an ICP_OP_HIT message.  The
   stale/fresh determination uses the local refresh (or TTL) rules.

   Note that a race condition exists for ICP_OP_HIT replies to sibling
   peers.  The ICP_OP_HIT means that a subsequent HTTP request for the
   named URL would result in a cache hit.  We assume that the HTTP
   request will come very quickly after the ICP_OP_HIT.  However, there
   is a slight chance that the object might be purged from this cache
   before the HTTP request is received.  If this happens, and the
   replying peer has applied Squid's `miss_access' configuration, then
   the user will receive a very confusing access denied message.

5.2.3.1.  ICP_OP_HIT_OBJ

   Before returning the ICP_OP_HIT message, we see if we can send an
   ICP_OP_HIT_OBJ message instead.  We can use ICP_OP_HIT_OBJ if:

   o    The ICP_OP_QUERY message had the ICP_FLAG_HIT_OBJ flag set.

   o    The entire object (plus URL) will fit in an ICP message.  The
        maximum ICP message size is 16 Kbytes, but an application may
        choose to set a smaller maximum value for ICP_OP_HIT_OBJ
        replies.

   Normally ICP replies are sent immediately after the query is
   received, but the ICP_OP_HIT_OBJ message cannot be sent until the
   object data is available to copy into the reply message.  For Squid
   and Harvest this means the object must be "swapped in" from disk if
   it is not already in memory.  Therefore, on average, an
   ICP_OP_HIT_OBJ reply will have higher latency than ICP_OP_HIT.

5.2.4.  ICP_OP_MISS_NOFETCH

   At this point we have a cache miss.  ICP has two types of miss
   replies.  If the cache does not want the peer to request the object
   from it, it sends an ICP_OP_MISS_NOFETCH message.

5.2.5.  ICP_OP_MISS

   Finally, an ICP_OP_MISS reply is returned as the default.  If the
   replying cache is a parent of the querying cache, the ICP_OP_MISS
   indicates an invitation to fetch the URL through the replying cache.

5.3.  Receiving ICP Replies

   Some ICP replies will be ignored; specifically, when any of the
   following are true:

   o    The reply message originated from an unknown peer.

   o    The object named by the URL does not exist.

   o    The object is already being fetched.

5.3.1.  ICP_OP_DENIED

   If more than 95% of more than 100 replies from a peer cache have been
   ICP_OP_DENIED, then such a high denial rate most likely indicates a
   configuration error, either locally or at the peer.  For this reason,
   no further queries will be sent to the peer for the duration of the
   cache process.

5.3.2.  ICP_OP_HIT

   Object retrieval commences immediately from the replying peer.

5.3.3.  ICP_OP_HIT_OBJ

   The object data is extracted from the ICP message and the retrieval
   is complete.  If there is some problem with the ICP_OP_HIT_OBJ
   message (e.g. missing data) the reply will be treated like a standard
   ICP_OP_HIT.

5.3.4.  ICP_OP_SECHO

   Object retrieval commences immediately from the origin server because
   the ICP_OP_SECHO reply arrived prior to any ICP_OP_HIT's.  If an
   ICP_OP_HIT had arrived prior, this ICP_OP_SECHO reply would be
   ignored because the retrieval has already started.

5.3.5.  ICP_OP_DECHO

   An ICP_OP_DECHO reply is handled like an ICP_OP_MISS.  Non-ICP peers
   must always be configured as parents; a non-ICP sibling makes no
   sense.  One serious problem with the ICP_OP_DECHO feature is that
   since it bounces messages off the peer's UDP echo port, it does not
   indicate that the peer cache is actually running -- only that network
   connectivity exists between the pair.

5.3.6.  ICP_OP_MISS

   If the peer is a sibling, the ICP_OP_MISS reply is ignored.
   Otherwise, the peer may be "remembered" for future use in case no HIT
   replies are received later (section 5.3.9).

   Harvest and Squid remember the first parent to return an ICP_OP_MISS
   message.  With Squid, the parents may be weighted so that the "first
   parent to miss" may not actually be the first reply received.  We
   call this the FIRST_PARENT_MISS.  Remember that sibling misses are
   entirely ignored; we only care about misses from parents.  The parent
   miss RTT's can be weighted because sometimes the closest parent is
   not the one people want to use.
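
   The reply handling of sections 5.3.2 through 5.3.6 can be sketched
   in Python as follows.  This is only an illustration and is not taken
   from Harvest or Squid: the names `Peer', `Transaction' and
   `handle_reply' are hypothetical, and dividing the measured RTT by a
   per-parent weight is an assumed weighting rule, since the text above
   says only that parent miss RTT's can be weighted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Peer:
    """A configured neighbor cache (hypothetical structure)."""
    name: str
    is_parent: bool       # False means the peer is a sibling
    weight: float = 1.0   # assumption: a larger weight makes a parent preferred


@dataclass
class Transaction:
    """State kept for one pending hierarchical request (hypothetical)."""
    source: Optional[str] = None              # set once retrieval has started
    first_parent_miss: Optional[Peer] = None  # the remembered FIRST_PARENT_MISS
    best_weighted_rtt: float = float("inf")


def handle_reply(txn: Transaction, peer: Optional[Peer],
                 opcode: str, rtt_ms: float) -> None:
    """Apply one ICP reply to the transaction state."""
    # Replies from unknown peers, or for an object that is already
    # being fetched, are ignored (section 5.3).
    if peer is None or txn.source is not None:
        return

    if opcode in ("ICP_OP_HIT", "ICP_OP_HIT_OBJ"):
        # A HIT starts retrieval from the replying peer; a HIT_OBJ
        # already carries the object data, so the peer is the source too.
        txn.source = peer.name
    elif opcode == "ICP_OP_SECHO":
        # The origin server answered before any HIT arrived.
        txn.source = "origin"
    elif opcode in ("ICP_OP_MISS", "ICP_OP_DECHO"):
        # DECHO is handled like MISS; sibling misses are ignored.
        if not peer.is_parent:
            return
        weighted = rtt_ms / peer.weight   # assumed weighting rule
        if weighted < txn.best_weighted_rtt:
            txn.best_weighted_rtt = weighted
            txn.first_parent_miss = peer
```

   Under this assumed rule, a parent with weight 4.0 whose miss arrives
   after 30 ms (weighted RTT 7.5) replaces a parent with weight 1.0
   whose miss arrived after 10 ms, which matches the remark that the
   closest parent is not always the one people want to use.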