get-size \tup{pageid} & return the size of \$pageid. \\
get-cachetime \tup{pageid} & return the time when page \$pageid was entered
into the cache. \\
\end{alist}

\subsection{Debugging}
\label{sec:webcache-debug}

HttpApp provides two debugging methods. \code{log} registers a file
handle as the trace file for all HttpApp-specific traces. Its trace
format is described in Section~\ref{sec:webcache-trace}. \code{evTrace}
logs a particular event into the trace file. It concatenates the time and
the id of the HttpApp with the given string, and writes the result out.
Details can be found in \ns/webcache/http.cc.

\section{Representing web pages}

We represent web pages as the abstract class Page. It is defined as follows:
\begin{program}
class Page {
public:
        Page(int size) : size_(size) {}
        int size() const { return size_; }
        int& id() { return id_; }
        virtual WebPageType type() const = 0;
protected:
        int size_;
        int id_;
};
\end{program}

It represents the basic properties of a web page: size and URL. From
it we derive two classes of web pages: ServerPage and ClientPage. The
former contains a list of page modification times, and is meant to
be used by servers. It was originally designed to work with a special
web server trace; currently it is not widely used in \ns. The latter,
ClientPage, is the default web page for all page pools below.

A ClientPage has the following major properties (we omit some
variables used by web caches with invalidation, which involve too many
details to be covered here):
\begin{itemize}
\item \code{HttpApp* server_} - Pointer to the original server of this
  page.
\item \code{double age_} - Lifetime of the page.
\item \code{int status_} - Status of the page. Its contents are
  explained below.
\end{itemize}

The status (32-bit) of a ClientPage is split into two 16-bit
parts. The first part (with mask 0x00FF) stores the page
status; the second part (with mask 0xFF00) stores the
actions that the cache is expected to perform on the page.
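
As a small worked illustration of this layout, take a hypothetical status
word \texttt{0x0301}; the value is chosen only for illustration and does not
correspond to any constant defined in \ns. Applying the two masks quoted
above (each of which, as written, covers eight bits) separates the fields:
\[
  \mathtt{0x0301} \mathbin{\&} \mathtt{0x00FF} = \mathtt{0x0001},
  \qquad
  \mathtt{0x0301} \mathbin{\&} \mathtt{0xFF00} = \mathtt{0x0300}.
\]
Under this reading, the first result would be the page status and the second
the pending cache action, so either field can be tested or updated without
disturbing the other.
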
The available page statuses are (again,
we omit those closely related to web cache invalidation):
\begin{alist}
HTTP\_VALID\_PAGE & Page is valid. \\
HTTP\_UNCACHEABLE & Page is uncacheable. This option can be used to
simulate CGI pages or dynamic server pages. \\
\end{alist}

ClientPage has the following major C++ methods:
\begin{itemize}
\item \code{type()} - Returns the type of the page. Assuming that pages of
  the same type have identical operations, all
  ClientPages are of type ``HTML''. If other types of web
  pages are needed later, a class may be derived from ClientPage (or Page)
  with the desired type.
\item \code{name(char *buf)} - Prints the page's name into the given
  buffer. A page's name has the format
  \tup{ServerName}:\tup{PageID}.
\item \code{split_name(const char *name, PageID& id)} - Splits a given
  page name into its two components. This is a static method.
\item \code{mtime()} - Returns the last modification time of the page.
\item \code{age()} - Returns the lifetime of the page.
\end{itemize}

\section{Page pools}
\label{sec:webcache-pagepool}

PagePool and its derived classes are used by servers to generate page
information (name, size, modification time, lifetime, etc.), by caches
to describe which pages are in storage, and by clients to generate a
request stream. Figure~\ref{fig:pagepool-hier} provides an overview of
the class hierarchy.

\begin{figure}[tb]
  \begin{center}
    \includegraphics{pagepool-hier}
    \caption{Class hierarchy of page pools}
    \label{fig:pagepool-hier}
  \end{center}
\end{figure}

Among these, class PagePool/Client is mostly used by caches to store
pages and other cache-related information; the other three classes are
used by servers and clients. In the following we describe these
classes one by one.

\subsection{PagePool/Math}

This is the simplest type of page pool. It has only one page, whose
size can be generated by a given random variable.
The page modification
sequence and the request sequence are generated using two given random
variables. It has the following OTcl methods:
\begin{alist}
gen-pageid & Returns the page ID which will be requested next. Because
  the pool has only one page, it always returns 0. \\
gen-size & Returns the size of the page. It can be generated by a
  given random variable. \\
gen-modtime \tup{pageID} \tup{mt} & Returns the next modification time of the
  page. \tup{mt} gives the last modification time. It uses the
  lifetime random variable. \\
ranvar-age \tup{rv} & Set the file lifetime random variable to be
  \tup{rv}. \\
ranvar-size \tup{rv} & Set the file size random variable to be
  \tup{rv}. \\
\end{alist}

{\em NOTE}: There are two ways to generate a request sequence. With
all page pools except PagePool/ProxyTrace, the request sequence is
generated with a random variable that describes the request
interval, and the \code{gen-pageid} method of the page pool gives
the page ID of the next request. PagePool/ProxyTrace loads the request
stream during its initialization phase, so it does not need a random
variable for the request interval; see its description below.

An example of using PagePool/Math is given in
Section~\ref{sec:webcache-example}. That script is also available at
\ns/tcl/ex/simple-webcache.tcl.

\subsection{PagePool/CompMath}

This pool improves on PagePool/Math by introducing a compound page
model. By a compound page we mean a page which consists of a main text
page and a number of embedded objects, e.g., GIFs. We model a compound
page as a main page and several component objects. The main page is
always assigned ID 0. All component pages
have the same size; both the main page size and the component object size are
fixed, but adjustable through the OTcl-bound variables \code{main_size_}
and \code{comp_size_}, respectively.
The number of component objects
can be set using the OTcl-bound variable \code{num_pages_}.

PagePool/CompMath has the following major OTcl methods:
\begin{alist}
gen-size \tup{pageID} & If \tup{pageID} is 0, return
  \code{main\_size\_}, otherwise return \code{comp\_size\_}. \\
ranvar-main-age \tup{rv} & Set the random variable for the main page
  lifetime. Another method, \code{ranvar-obj-age}, sets that for component
  objects. \\
gen-pageid & Always returns 0, which is the main page ID. \\
gen-modtime \tup{pageID} \tup{mt} & Returns the next modification time
  of the given page \tup{pageID}. If the given ID is 0, it uses the
  main page lifetime random variable; otherwise it uses the component
  object lifetime random variable. \\
\end{alist}

An example of using PagePool/CompMath is available at
\ns/tcl/ex/simple-webcache-comp.tcl.

\subsection{PagePool/ProxyTrace}

The above two page pools synthesize a request stream to a single web page
from two random variables: one for the request interval, another for
the requested page ID. Sometimes users may want a more complicated request
stream, which consists of multiple pages and exhibits spatial locality
and temporal locality. One existing proposal, SURGE~\cite{Barf98:WebWorkload},
generates such request streams; we choose to provide an
alternative solution: using real web proxy cache traces (or server
traces). The class PagePool/ProxyTrace uses real traces to drive the
simulation. Because there exist many web traces with different
formats, they should be converted into an intermediate format before
being fed into this page pool. The converter is available at
http://mash.cs.berkeley.edu/dist/vint/webcache-trace-conv.tar.gz.
It accepts four trace formats: DEC proxy trace (1996), UCB
Home-IP trace, NLANR proxy trace, and EPA web server trace. It
converts a given trace into two files: pglog and reqlog.
Each line in
pglog has the following format:
\begin{center}
\begin{verbatim}
[<serverID> <URL_ID> <PageSize> <AccessCount>]
\end{verbatim}
\end{center}

Each line, except the last line, in reqlog has the following format:
\begin{center}
\begin{verbatim}
[<time> <clientID> <serverID> <URL_ID>]
\end{verbatim}
\end{center}

The last line in reqlog records the duration of the entire trace and
the total number of unique URLs:
\begin{center}
\begin{verbatim}
i <Duration> <Number_of_URL>
\end{verbatim}
\end{center}

PagePool/ProxyTrace takes these two files as input, and uses them to
drive the simulation. Because most existing web proxy traces do not
contain complete page modification information, we choose to use a
bimodal page modification model \cite{Cao97:CacheConsistency}. We
allow the user to select $x\%$ of the pages to have one random page
modification interval generator, and the rest of the pages to have
another generator. In this way, it is possible to let $x\%$ of the pages be
dynamic, i.e., modified frequently, and the rest static. Dynamic pages are
spread evenly across the popularity ranking. For example, assume 10\% of the
pages are dynamic; then if we sort pages into a list according to their
popularity, pages 0, 10, 20, $\ldots$ are dynamic and the rest are static.
Because of this selection mechanism, we only allow the bimodal ratio to change
in units of 10\%.

In order to distribute requests to different requestors in the
simulator, PagePool/ProxyTrace maps the client ID in the traces to
requestors in the simulator using a modulo operation.

PagePool/ProxyTrace has the following major OTcl methods:
\begin{alist}
get-poolsize & Returns the total number of pages. \\
get-duration & Returns the duration of the trace. \\
bimodal-ratio & Returns the bimodal ratio. \\
set-client-num \tup{num} & Set the number of requestors in the
simulation. \\
gen-request \tup{ClientID} & Generate the next request for the given
requestor. \\
gen-size \tup{PageID} & Returns the size of the given page.
\\
bimodal-ratio \tup{ratio} & Set the fraction of dynamic pages to \tup{ratio}*10
percent. Note that this ratio changes in units of 10\%. \\
ranvar-dp \tup{ranvar} & Set the page modification interval generator for
dynamic pages. Similarly, ranvar-sp \tup{ranvar} sets the generator
for static pages. \\
set-reqfile \tup{file} & Set the request stream file, as discussed
above. \\
set-pgfile \tup{file} & Set the page information file, as discussed
above. \\
gen-modtime \tup{PageID} \tup{LastModTime} & Generate the next
modification time for the given page. \\
\end{alist}

An example of using PagePool/ProxyTrace is available at
\ns/tcl/ex/simple-webcache-trace.tcl.

\subsection{PagePool/Client}

The class PagePool/Client helps caches keep track of the pages resident
in the cache, and store various cache-related information about
pages. It is mostly implemented in C++, because it is mainly used
internally and little of its functionality is needed by users. It has the
following major C++ methods:
\begin{itemize}
\item \code{get_page(const char* name)} - Returns a pointer to the
  page with the given name.
\item \code{add_page(const char *name, int size, double mt, double et, double age)} -
  Adds a page with the given size, last modification
  time (mt), cache entry time (et), and page lifetime (age).
\item \code{remove_page(const char* name)} - Removes a page from the cache.
\end{itemize}

This page pool should support various cache replacement algorithms;
however, this has not been implemented yet.

\section{Web client}
\label{sec:webcache-client}

Class Http/Client models the behavior of a simple web browser. It
generates a sequence of page requests, where the request interval and page
IDs are randomized. It is a pure OTcl class inherited from Http.
Next we walk through its functionality and usage.

\paragraph{Creating a client}
First of all, we create a client and connect it to a cache and a web server.
Currently a client is only allowed to connect to a single cache, but it
may connect to multiple servers.
Note that this has to be called \emph{AFTER} the simulation starts (i.e., after \code{$ns run} %$
is called).
This remains true for all of the following methods and code examples of
Http and its derived classes, unless explicitly stated otherwise.
\begin{program}
        # Assuming $server is a configured Http/Server.
        set client [new Http/Client $ns $node] \; client resides on this node;
        $client connect $server \; connecting client to server;
\end{program} %$

\paragraph{Configuring request generation}
For every request, Http/Client uses a PagePool to generate a random page
ID, and uses a random variable to generate the interval between two
consecutive requests:\footnote{Some PagePools,
e.g., PagePool/Math, have only one page and therefore always return the
same page. Other PagePools, e.g., PagePool/Trace, have multiple
pages and need a random variable to pick out a random page.}
\begin{program}
        $client set-page-generator $pgp \; attach a configured PagePool;
        $client set-interval-generator $ranvar \; attach a random variable;
\end{program}
Here we assume that the PagePools of Http/Client share the same set of pages
as the PagePools of the server. Usually we simplify our simulation by letting
all clients and servers share the same PagePool, i.e., they have the same
set of pages. When there are multiple servers, or the servers' PagePools are
separate from those of the clients, care must be taken to make sure that
every client sees the same set of pages as the servers to which it is
attached.

\paragraph{Starting}
After the above setup, starting requests is very simple:
\begin{program}
        $client start-session $cache $server \; assuming $cache is a configured Http/Cache;
\end{program}

\paragraph{OTcl interfaces}
The following is a list of its OTcl methods (in addition to those
inherited from Http). This is not a complete list.
More details can be
found in \ns/tcl/webcache/http-agent.tcl.
\begin{alist}
send-request \tup{server} \tup{type} \tup{pageid} \tup{args} & send a
request for page \$pageid with type \$type to \$server. The only request
type allowed for a client is GET. \$args has a format identical
to that of \$attributes described in \code{Http::enter-page}. \\
start-session \tup{cache} \tup{server} & start sending requests for a
random page to \$server via \$cache. \\
start \tup{cache} \tup{server} & before sending requests, populate
\$cache with all pages in the client's PagePool. This method is useful
when we assume infinite-sized caches and want to observe the behavior of
cache consistency algorithms in steady state. \\