\chapter{Web cache as an application}
\label{chap:webcache}

All applications described above are ``virtual'' applications, in the sense
that they do not actually transfer their own data in the simulator; all that
matters is the \emph{size} of the data and the \emph{time} at which they are
transferred. Sometimes we may want applications to transfer their own data in
simulations. One such example is web caching, where we want HTTP servers to
send HTTP headers to caches and clients. These headers contain page
modification time information and other caching directives, which are
important for some cache consistency algorithms.

In the following, we first describe general issues regarding
transmitting application-level data in \ns, then we discuss special
issues, as well as APIs, related to transmitting application data
using TCP as transport. We will then proceed to discuss the internal
design of the HTTP client, server, and proxy cache.

\section{Using application-level data in \ns}

\begin{figure}[tb]
  \begin{center}
    \centerline{\includegraphics{app-dataflow}}
    \caption{Examples of application-level data flow}
    \label{fig:app-dataflow}
  \end{center}
\end{figure}

In order to transmit application-level data in \ns, we provide a
uniform structure to pass data among applications, and to
pass data from applications to transport agents
(Figure~\ref{fig:app-dataflow}). It has three major components: a
representation of a uniform application-level data unit (ADU), a
common interface to pass data between applications, and two mechanisms
to pass data between applications and transport agents.

\subsection{ADU}

The functionality of an ADU is similar to that of a Packet. It needs to
pack user data into an array, which is then included in the user data
area of an \ns\ packet by an Agent (current Agents do not support this;
users must derive new agents that accept user data from
applications, or use a wrapper such as TcpApp, as discussed later).

Compared with Packet, ADU provides this functionality in a different
way.
In Packet, a common area is allocated for all packet headers; an
offset is used to access different headers in this area. In ADU this
is not applicable, because some ADUs allocate their space dynamically
according to the availability of user data. For example, if we want
to deliver an OTcl script between applications, the size of the script
is undetermined beforehand. Therefore, we choose a less efficient but
more flexible method. Each ADU defines its own data members, and
provides methods to serialize them (i.e., pack data into an array and
extract them from an array). For example, in the abstract base class
of all ADUs, AppData, we have:
\begin{program}
        class AppData \{
        private:
                AppDataType type_;  // ADU type
        public:
                struct hdr \{
                        AppDataType type_;
                \};
        public:
                AppData(char* b) \{
                        assert(b != NULL);
                        type_ = ((hdr *)b)->type_;
                \}
                virtual void pack(char* buf) const;
        \};
\end{program}
Here \code{pack(char* buf)} is used to write an AppData object
into an array, and \code{AppData(char* b)} is used to build a new
AppData from a ``serialized'' copy of the object in an array.

When deriving a new ADU from the base class, users may add more data
members, but they must also provide a new \code{pack(char *b)} and a new
constructor to write those data members to, and read them from, an
array. For an example of how to derive from an ADU, see
\ns/webcache/http-aux.h.

\subsection{Passing data between applications}

The base class of Application, Process, allows applications to pass
data to, or request data from, each other. It is defined as follows:
\begin{program}
        class Process \{
        public:
                Process() : target_(0) \{\}
                inline Process*& target() \{ return target_; \}

                virtual void process_data(int size, char* data) = 0;
                virtual void send_data(int size, char* data = 0);

        protected:
                Process* target_;
        \};
\end{program}
Process allows Applications to be linked together.
%{\bf TBA}

\subsection{Transmitting user data over UDP}

Currently there is no support in class Agent for transmitting user
data.
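
As a minimal, standalone sketch of the two ideas above (an ADU that
serializes itself, and Processes linked through \code{target()}),
consider the following C++ fragment. It does not use the \ns\ classes:
\code{TextADU}, \code{Source}, \code{RecordingSink}, \code{roundTrip}, and
\code{fromBuffer} are hypothetical names introduced only for illustration,
the \code{Process} class is a simplified copy of the declaration shown
earlier, and the body given to \code{send_data} (forwarding to the target)
is an assumption. A named function is used instead of a buffer constructor
so that string literals cannot be mistaken for serialized buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical ADU type tag; ns defines its own AppDataType values.
enum AppDataType { ADU_TEXT = 1 };

// Hypothetical ADU carrying variable-length text. Like AppData, it owns
// its data members and knows how to pack them into a flat array.
class TextADU {
public:
    explicit TextADU(const std::string& text) : type_(ADU_TEXT), text_(text) {}

    // Number of bytes that pack() will write.
    int size() const { return static_cast<int>(sizeof(Header) + text_.size()); }

    // Serialize: fixed-size header first, then the text bytes.
    void pack(char* buf) const {
        Header h = { type_, static_cast<int>(text_.size()) };
        std::memcpy(buf, &h, sizeof(h));
        std::memcpy(buf + sizeof(h), text_.data(), text_.size());
    }

    // Deserialize from a buffer previously filled by pack().
    static TextADU fromBuffer(const char* buf) {
        assert(buf != nullptr);
        Header h;
        std::memcpy(&h, buf, sizeof(h));
        std::string text(buf + sizeof(h), static_cast<std::size_t>(h.len));
        TextADU adu(text);
        adu.type_ = h.type;
        return adu;
    }

    const std::string& text() const { return text_; }

private:
    struct Header { AppDataType type; int len; };
    AppDataType type_;
    std::string text_;
};

// Simplified Process; the send_data body is an assumption of this sketch.
class Process {
public:
    Process() : target_(nullptr) {}
    virtual ~Process() {}
    Process*& target() { return target_; }
    virtual void process_data(int size, char* data) = 0;
    virtual void send_data(int size, char* data = nullptr) {
        if (target_) target_->process_data(size, data);
    }
protected:
    Process* target_;
};

// Receiving side: rebuilds the ADU from the array and records its text.
class RecordingSink : public Process {
public:
    void process_data(int, char* data) override {
        received.push_back(TextADU::fromBuffer(data).text());
    }
    std::vector<std::string> received;
};

// Sending side: packs an ADU into a temporary array and passes it on.
class Source : public Process {
public:
    void process_data(int, char*) override {}  // this end only sends
    void sendADU(const TextADU& adu) {
        std::vector<char> buf(adu.size());
        adu.pack(buf.data());
        send_data(adu.size(), buf.data());
    }
};

// Links a Source to a RecordingSink and passes one ADU across the link.
std::string roundTrip(const std::string& text) {
    Source src;
    RecordingSink sink;
    src.target() = &sink;
    src.sendADU(TextADU(text));
    return sink.received.empty() ? std::string() : sink.received.front();
}
```

Calling \code{roundTrip} with some text packs it at the source, hands the
array to the sink through the link, and returns the text recovered there.
Because the header records the text length, the array can be sized at run
time, which is the flexibility the per-ADU serialization is meant to buy.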
There are two ways to transmit a serialized ADU through transport
agents. First, for the UDP agent (and all agents derived from it), we
can derive from class UDP and add a new method
\code{send(int nbytes, char *userdata)} to pass user data from
Application to Agent. Passing data from an Agent to an Application is
somewhat trickier: each agent has a pointer to its attached
application; we dynamically cast this pointer to an AppConnector and
then call \code{AppConnector::process_data()}.

As an example, we illustrate how class HttpInvalAgent is
implemented. It is based on UDP, and is intended to deliver web cache
invalidation messages (\ns/webcache/inval-agent.h). It is defined as:
\begin{program}
        class HttpInvalAgent : public Agent \{
        public:
                HttpInvalAgent();
                virtual void recv(Packet *, Handler *);
                virtual void send(int realsize, AppData* data);
        protected:
                int off_inv_;
        \};
\end{program}
Here \code{recv(Packet*, Handler*)} is overridden to extract user data,
and a new \code{send(int, AppData*)} is provided to include user data
in packets. An application (HttpApp) is attached to an HttpInvalAgent
using \code{Agent::attachApp()} (a dynamic cast is needed). In
\code{send()}, the following code is used to write user data from
AppData to the user data area in a packet:
\begin{program}
        // allocate a packet with room for the serialized ADU
        Packet *pkt = allocpkt(data->size());
        hdr_inval *ih = (hdr_inval *)pkt->access(off_inv_);
        ih->size() = data->size();
        // pack the ADU into the user data area of the packet
        char *p = (char *)pkt->accessdata();
        data->pack(p);
\end{program}
In \code{recv()}, the following code is used to read user data from
the packet and to deliver it to the attached application:
\begin{program}
        hdr_inval *ih = (hdr_inval *)pkt->access(off_inv_);
        // hand the user data area to the attached application
        ((HttpApp*)app_)->process_data(ih->size(), (char *)pkt->accessdata());
        Packet::free(pkt);
\end{program}

\subsection{Transmitting user data over TCP}
\label{sec:webcache-tcpapp}

Transmitting user data over TCP is trickier than doing so over UDP,
mainly because TCP's reassembly queue is only available for
FullTcp.
We deal with this problem by abstracting a TCP connection as
a FIFO pipe.

As indicated in section~\ref{sec:upcalls}, transmission of application data
can be implemented via agent upcalls. Assuming we are using TCP agents,
all data are delivered in sequence, which means we can view the
TCP connection as a FIFO pipe. We emulate user data transmission over TCP
as follows. We first provide a buffer for application data at the sender.
Then we count the bytes received at the receiver. When the receiver
has received all bytes of the current data transmission,
it gets the data directly from the sender.
Class Application/TcpApp is used to implement this functionality.

A TcpApp object contains a pointer to a transport agent, presumably either
a FullTcp or a SimpleTcp.\footnote{A SimpleTcp agent is used solely for
web caching simulations. It is actually a UDP agent. It has neither
error recovery nor flow/congestion control, and it does not do packet
segmentation. Assuming a loss-free network and in-order packet delivery,
the SimpleTcp agent simplifies the trace files and hence aids the debugging
of application protocols, which, in our case, is the web cache
consistency protocol.}
(Currently TcpApp does not support asymmetric TCP agents, i.e., those in
which the sender is separated from the receiver.)
It provides the following OTcl interfaces:
\begin{itemize}
\item \code{connect}: Connects another TcpApp to this one. This
  connection is bi-directional, i.e., only one call to \code{connect}
  is needed, and data can be sent in either direction.
\item \code{send}: It takes two arguments: \code{(nbytes, str)}.
  \code{nbytes} is the ``nominal'' size of application data.
  \code{str} is application data in string form.
\end{itemize}

In order to send application data in binary form, TcpApp provides a
virtual C++ method \code{send(int nbytes, int dsize, const char *data)}.
In fact, this is the method used to implement the OTcl method \code{send}.
Because it is difficult to deal with binary data in Tcl, no OTcl interface
is provided to handle binary data. \code{nbytes} is the number of bytes
to be transmitted, and \code{dsize} is the actual size of the array
\code{data}.

TcpApp provides a C++ virtual method
\code{process_data(int size, char* data)}
to handle the received data. The default handling is to treat the data
as a Tcl script and evaluate the script. It is easy to derive a class
that provides other types of handling.

Here is an example of using Application/TcpApp. A similar example is
\code{Test/TcpApp-2node} in \ns/tcl/test/test-suite-webcache.tcl.
First, we create FullTcp agents and connect them:
\begin{program}
        set tcp1 [new Agent/TCP/FullTcp]
        set tcp2 [new Agent/TCP/FullTcp]
        # {\cf Set TCP parameters here, e.g., window_, iss_, \ldots}

        $ns attach-agent $n1 $tcp1
        $ns attach-agent $n2 $tcp2
        $ns connect $tcp1 $tcp2
        $tcp2 listen
\end{program}
Then we create TcpApps and connect them:
\begin{program}
        set app1 [new Application/TcpApp $tcp1]
        set app2 [new Application/TcpApp $tcp2]
        $app1 connect $app2
\end{program}
Now we let \code{$app1} %$
be the sender and \code{$app2} %$
be the receiver:
\begin{program}
        $ns at 1.0 "$app1 send 100 \bs"$app2 app-recv 100\bs""
\end{program} %$
where \code{app-recv} is defined as:
\begin{program}
        Application/TcpApp instproc app-recv { size } {
                global ns
                puts "[$ns now] app2 receives data $size from app1"
        }
\end{program}

\subsection{Class hierarchy related to user data handling}

We conclude this section by providing a hierarchy of the classes involved
in this section (Figure~\ref{fig:appdata-hier}).

\begin{figure}[tb]
  \begin{center}
    \includegraphics{appdata-hier}
    \caption{Hierarchy of classes related to application-level data handling}
    \label{fig:appdata-hier}
  \end{center}
\end{figure}

\section{Overview of web cache classes}
\label{sec:webcache-class}

As in the real world, there are three major classes related to web
caching: client (browser), server, and cache. Because they share a
common feature, i.e., the HTTP protocol, they are derived from the
same base class \code{Http} (this is the OTcl class name; the C++ class
is called \code{HttpApp}). \code{Http} is not a real
Application, for two reasons. First, an HTTP object (i.e.,
client/cache/server) may
want to maintain multiple concurrent HTTP connections, but an
Application contains only one \code{agent_}. Second, an HTTP object
needs to transmit real data (e.g., HTTP headers), and that is provided by
TcpApp instead of any Agent. Therefore, we choose to use a standalone
class derived from TclObject for the common features of all HTTP objects,
which are managing HTTP connections and a set of pages. In the rest
of this section, we discuss these functionalities of Http. In the
next three sections, we describe in turn the HTTP client, cache, and
server.

\subsection{Managing HTTP connections}
\label{sec:webcache-connection}

Every HTTP connection is embodied as a TcpApp
object. Http maintains a hash of TcpApp objects, which are all of its
active connections. It assumes that it has at most one HTTP connection
to any other Http object. It also allows dynamic establishment and
teardown of connections. Only an OTcl interface is provided for
establishing and tearing down a connection and for sending data through
a connection.

\paragraph{OTcl methods}
Following is a list of OTcl interfaces related to connection management in
Http objects:
\begin{alist}
id & return the id of the Http object, which is the id of the node the object
is attached to. \\
get-cnc \tup{client} & return the TCP agent associated with \$client
(Http object).\\
is-connected \tup{server} & return 0 if not connected to \$server, 1 otherwise.\\
send \tup{client} \tup{bytes} \tup{callback} & send \$bytes of data to \$client.
When it's done, execute \$callback (an OTcl command). \\
connect \tup{client} \tup{TCP} & associate a TCP agent with \$client (Http
object). That agent will be used to send packets \emph{to} \$client. \\
disconnect \tup{client} & delete the association of a TCP agent with \$client.
Note that neither the TCP agent nor \$client is deleted; only the
association is deleted.\\
\end{alist}

\paragraph{Configuration parameter}
By default, Http objects use Agent/SimpleTcp as transport agents
(section~\ref{sec:webcache-tcpapp}). They can also use Agent/FullTcp
agents, which allows Http objects to operate in a lossy network.
The class variable \code{TRANSPORT\_} is used for this purpose. E.g.,
\code{Http set TRANSPORT\_ FullTcp} tells all Http objects to use
FullTcp agents.

This configuration should be done \emph{before} the simulation starts, and
it should not change during the simulation, because FullTcp agents do not
inter-operate with SimpleTcp agents.

\subsection{Managing web pages}
\label{sec:webcache-page}

Http also provides OTcl interfaces to manage a set of pages. The real
management of pages is handled by class \code{PagePool} and its
subclasses. Because different HTTP objects have different requirements
for page management, we allow different PagePool subclasses to be attached
to different subclasses of the Http class. Meanwhile, we export
a common set of PagePool interfaces to OTcl through
Http. For example, a browser may use a PagePool only to generate a
request stream, so its PagePool only needs to contain a list of URLs. But
a cache may want to store the page size and last modification time of
every page instead of a list of URLs. However, this separation is not
clear-cut in the current implementation.

Page URLs are represented in the form of:
\code{\tup{ServerName}:\tup{SequenceNumber}}
where {\tt ServerName} is the name of an OTcl object, and every page
in every server should have a unique {\tt SequenceNumber}.
Page contents are ignored.
Instead, every page contains several \emph{attributes}, which
are represented in OTcl as a list of the following
(\tup{name} \tup{value}) pairs: ``modtime \tup{val}''
(page modification time), ``size \tup{val}'' (page size), and
``age \tup{val}''.
The ordering of these pairs is not significant.

Following is a list of related OTcl methods.
\begin{alist}
set-pagepool \tup{pagepool} & set page pool \\
enter-page \tup{pageid} \tup{attributes} & add a page with id \$pageid
into the pool. \$attributes is the attributes of \$pageid, as described above. \\
get-page \tup{pageid} & return page attributes in the format described
above. \\
get-modtime \tup{pageid} & return the last modification time of the page
\$pageid. \\
exist-page \tup{pageid} & return 0 if \$pageid doesn't exist in this
Http object, 1 otherwise. \\