functions such as ICP query timeouts.

<sect2>Filedescriptor Management
<P>
<em/Files:/ <tt/fd.c/
<P>
Here we track the number of file descriptors in use, and the number of
bytes which have been read from or written to each file descriptor.

<sect2>Hashtable Support
<P>
<em/Files:/ <tt/hash.c/
<P>
These routines implement generic hash tables. A hash table is created
with a function for hashing the key values, and a function for
comparing the key values.

<sect2>HTTP Anonymization
<P>
<em/Files:/ <tt/http-anon.c/
<P>
These routines support anonymizing HTTP requests leaving the cache.
Either specific request headers will be removed (the ``standard''
mode), or only specific request headers will be allowed (the
``paranoid'' mode).

<sect2>Internet Cache Protocol
<P>
<em/Files:/ <tt/icp_v2.c/, <tt/icp_v3.c/
<P>
Here we implement the Internet Cache Protocol. This protocol is
documented in RFC 2186 and RFC 2187. The bulk of the code is in the
<tt/icp_v2.c/ file. The other file, <tt/icp_v3.c/, contains a single
function for handling ICP queries from Netcache/Netapp caches; they
use a different version number and a slightly different message
format.

<sect2>Ident Lookups
<P>
<em/Files:/ <tt/ident.c/
<P>
These routines support RFC 931 ``Ident'' lookups. An ident server
running on a host will report the user name associated with a
connected TCP socket. Some sites use this facility for access control
and logging purposes.

<sect2>Memory Management
<P>
<em/Files:/ <tt/mem.c/
<P>
These routines allocate and manage pools of memory for frequently-used
data structures. When the <em/memory_pools/ configuration option is
enabled, unused memory is not actually freed. Instead it is kept for
future use. This may result in more efficient use of memory at the
expense of a larger process size.

<sect2>Multicast Support
<P>
<em/Files:/ <tt/multicast.c/
<P>
Currently, multicast is only used for ICP queries.
The routines in this file implement joining a UDP socket to a
multicast group (or groups), and setting the multicast TTL value on
outgoing packets.

<sect2>Persistent Server Connections
<P>
<em/Files:/ <tt/pconn.c/
<P>
These routines manage idle, persistent HTTP connections to origin
servers and neighbor caches. Idle sockets are indexed in a hash table
by their socket address (IP address and port number). Up to 10 idle
sockets will be kept for each socket address, but only for 15 seconds.
After 15 seconds, idle socket connections are closed.

<sect2>Refresh Rules
<P>
<em/Files:/ <tt/refresh.c/
<P>
These routines decide whether a cached object is stale or fresh, based
on the <em/refresh_pattern/ configuration options. If an object is
fresh, it can be returned as a cache hit. If it is stale, then it must
be revalidated with an If-Modified-Since request.

<sect2>SNMP Support
<P>
<em/Files:/ <tt/snmp.c/, <tt/snmp_agent.c/, <tt/snmp_config.c/, <tt/snmp_vars.c/
<P>
These routines implement SNMP for Squid. At the present time, we have
made almost all of the cachemgr information available via SNMP.

<sect2>URN Support
<P>
<em/Files:/ <tt/urn.c/
<P>
We are experimenting with URN support in Squid version 1.2. Note,
we're not talking about full-blown generic URNs here. This is
primarily targeted towards using URNs as a smart way of handling lists
of mirror sites. For more details, please see
<url url="http://squid.nlanr.net/Squid/urn-support.html" name="URN support in Squid">.

<sect1>External Programs

<sect2>dnsserver
<P>
<em/Files:/ <tt/dnsserver.c/
<P>
Because the standard <tt/gethostbyname(3)/ library call blocks, Squid
must use external processes to actually make these calls. Typically
there will be ten <tt/dnsserver/ processes spawned from Squid.
Communication occurs via TCP sockets bound to the loopback interface.
The functions in <tt/dns.c/ are primarily concerned with starting and
stopping the dnsservers.
Reading from and writing to the dnsservers occurs in the IP and FQDN
cache modules.

<sect2>pinger
<P>
<em/Files:/ <tt/pinger.c/
<P>
Although it would be possible for Squid to send and receive ICMP
messages directly, we use an external process for two important
reasons:
<enum>
<item>Because Squid handles many file descriptors simultaneously, we
get much more accurate RTT measurements when ICMP is handled by a
separate process.
<item>Superuser privileges are required to send and receive ICMP.
Rather than require Squid to be started as root, we prefer to have the
smaller and simpler <em/pinger/ program installed with setuid
permissions.
</enum>

<sect2>unlinkd
<P>
<em/Files:/ <tt/unlinkd.c/
<P>
The <tt/unlink(2)/ system call can cause a process to block for a
significant amount of time. Therefore we do not want to make unlink()
calls from Squid. Instead we pass them to this external process.

<sect2>redirector
<P>
<em/Files:/ user-developed
<P>
A redirector process reads URLs on stdin and writes (possibly changed)
URLs on stdout. It is implemented as an external process to maximize
flexibility.

<sect1>Sequence of a Typical Request
<P>
<enum>
<item>A client connection is accepted by the <em/client-side/. The
HTTP request is parsed.
<item>The access controls are checked. The client-side builds an ACL
state data structure and registers a callback function for
notification when access control checking is completed.
<item>After the access controls have been verified, the client-side
looks for the requested object in the cache. If it is a cache hit,
then the client-side registers its interest in the <em/StoreEntry/.
Otherwise, Squid needs to forward the request, perhaps with an
If-Modified-Since header.
<item>The request-forwarding process begins with <tt/protoDispatch/.
This function begins the peer selection procedure, which may involve
sending ICP queries and receiving ICP replies.
The peer selection procedure also involves checking configuration
options such as <em/never_direct/ and <em/always_direct/.
<item>When the ICP replies (if any) have been processed, we end up at
<em/protoStart/. This function calls an appropriate protocol-specific
function for forwarding the request. Here we will assume it is an HTTP
request.
<item>The HTTP module first opens a connection to the origin server or
cache peer. If there is no idle persistent socket available, a new
connection request is given to the Network Communication module with a
callback function. The <tt/comm.c/ routines may try establishing a
connection multiple times before giving up.
<item>When a TCP connection has been established, HTTP builds a
request buffer and submits it for writing on the socket. It then
registers a read handler to receive and process the HTTP reply.
<item>As the reply is initially received, the HTTP reply headers are
parsed and placed into a reply data structure. As reply data is read,
it is appended to the <em/StoreEntry/. Every time data is appended to
the <em/StoreEntry/, the client-side is notified of the new data via a
callback function.
<item>As the client-side is notified of new data, it copies the data
from the StoreEntry and submits it for writing on the client socket.
<item>As data is appended to the <em/StoreEntry/, and the client(s)
read it, the data may be submitted for writing to disk.
<item>When the HTTP module finishes reading the reply from the
upstream server, it marks the <em/StoreEntry/ as ``complete.'' The
server socket is either closed or given to the persistent connection
pool for future use.
<item>When the client-side has written all of the object data, it
unregisters itself from the <em/StoreEntry/. At the same time it
either waits for another request from the client, or closes the client
connection.
</enum>

<!-- %%%% Chapter : MAIN LOOP %%%% -->
<sect>The Main Loop: <tt/comm_select()/
<P>
At the core of Squid is the <tt/select(2)/ system call.
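<P>
As a rough illustration of the underlying system call (this is not Squid code), the following sketch waits for a single descriptor to become readable using the standard POSIX <tt/select()/ interface. The helper name <tt/wait_readable/ and its return convention are assumptions made for this example only.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Squid: wait up to timeout_sec
 * seconds for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_sec)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    /* Build a read set containing only the descriptor of interest. */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* The first argument is the highest descriptor number plus one. */
    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;

    /* On return, the set holds only the descriptors that are ready. */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

Called on the read end of an empty pipe with a zero timeout, this helper reports a timeout; once a byte has been written to the pipe, it reports that the descriptor is readable.
<P>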
Squid uses <tt/select()/ or <tt/poll(2)/ to process I/O on all open
file descriptors. Hereafter we'll only use ``select'' to refer
generically to either system call.
<P>
The <tt/select()/ and <tt/poll()/ system calls work by waiting for I/O
events on a set of file descriptors. Squid only checks for <em/read/
and <em/write/ events. Squid knows that it should check for reading or
writing when there is a read or write handler registered for a given
file descriptor. Handler functions are registered with the
<tt/commSetSelect/ function. For example:
<verb>
	commSetSelect(fd, COMM_SELECT_READ, clientReadRequest, conn, 0);
</verb>
In this example, <em/fd/ is a TCP socket to a client connection. When
there is data to be read from the socket, then the select loop will
execute
<verb>
	clientReadRequest(fd, conn);
</verb>
<P>
The I/O handlers are reset every time they are called. In other words,
a handler function must re-register itself with <tt/commSetSelect/ if
it wants to continue reading or writing on a file descriptor. The I/O
handler may be canceled before being called by providing NULL
arguments, e.g.:
<verb>
	commSetSelect(fd, COMM_SELECT_READ, NULL, NULL, 0);
</verb>
<P>
These I/O handlers (and others) and their associated callback data
pointers are saved in the <em/fde/ data structure:
<verb>
	struct _fde {
		...
		PF *read_handler;
		void *read_data;
		PF *write_handler;
		void *write_data;
		close_handler *close_handler;
		DEFER *defer_check;
		void *defer_data;
	};
</verb>
<em/read_handler/ and <em/write_handler/ are called when the file
descriptor is ready for reading or writing, respectively. The
<em/close_handler/ is called when the file descriptor is closed. The
<em/close_handler/ is actually a linked list of callback functions to
be called.
<P>
In some situations we want to defer reading from a file descriptor,
even though it has data for us to read.
This may be the case when data arrives from the server-side faster
than it can be written to the client-side. Before adding a file
descriptor to the ``read set'' for select, we call <em/defer_check/
(if it is non-NULL). If <em/defer_check/ returns 1, then we skip the
file descriptor for that pass through the select loop.
<P>
These handlers are stored in the <em/FD_ENTRY/ structure as defined in
<tt/comm.h/. <tt/fd_table[]/ is the global array of <em/FD_ENTRY/
structures. The handler functions are of type <em/PF/, which is a
typedef:
<verb>
	typedef void (*PF) (int, void *);
</verb>
The close handler is really a linked list of handler functions. Each
handler also has an associated pointer <tt/(void *data)/ to some kind
of data structure.
<P>
<tt/comm_select()/ is the function which issues the select() system
call. It scans the entire <tt/fd_table[]/ array looking for handler
functions. Each file descriptor with a read handler will be set in the
<tt/fd_set/ read bitmask. Similarly, write handlers are scanned and
bits set for the write bitmask. <tt/select()/ is then called, and the
returned read and write bitmasks are scanned for descriptors with
pending I/O. For each ready descriptor, the handler is called. Note
that the handler is cleared from the <em/FD_ENTRY/ before it is
called.
<P>
After each handler is called, <tt/comm_select_incoming()/ is called to
process new HTTP and ICP requests.
<P>
Typical read handlers are
<tt/httpReadReply()/,
<tt/diskHandleRead()/,
<tt/icpHandleUdp()/,
and <tt/ipcache_dnsHandleRead()/.
Typical write handlers are
<tt/commHandleWrite()/,
<tt/diskHandleWrite()/,
and <tt/icpUdpReply()/.
The handler function is set with <tt/commSetSelect()/, with the
exception of the close handlers, which are set with
<tt/comm_add_close_handler()/.
<P>
The close handlers are normally called from <tt/comm_close()/. The job
of the close handlers is to deallocate data structures associated with
the file descriptor.
For this reason <tt/comm_close()/ must normally be the last function
in a sequence to prevent accessing just-freed memory.
<P>
The timeout and lifetime handlers are called for file descriptors
which have been idle for too long. They are further discussed in a
following chapter.

<!-- %%%% Chapter : CLIENT REQUEST PROCESSING %%%% -->
<sect>Processing Client Requests

<!-- %%%% Chapter : STORAGE MANAGER %%%% -->
<sect>Storage Manager

<!-- %%%% Chapter : FORWARDING SELECTION %%%% -->