.\" Copyright (c) 1993 The Usenix Association. All rights reserved.
.\"
.\" This document is derived from software contributed to Berkeley by
.\" Rick Macklem at The University of Guelph with the permission of
.\" the Usenix Association.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)nqnfs.me	8.1 (Berkeley) 4/20/94
.\"
.lp
.nr PS 12
.ps 12
Reprinted with permission from the "Proceedings of the Winter 1994 Usenix
Conference", January 1994, San Francisco, CA, Copyright The Usenix
Association.
.nr PS 14
.ps 14
.sp
.ce
\fBNot Quite NFS, Soft Cache Consistency for NFS\fR
.nr PS 12
.ps 12
.sp
.ce
\fIRick Macklem\fR
.ce
\fIUniversity of Guelph\fR
.sp
.nr PS 12
.ps 12
.ce
\fBAbstract\fR
.nr PS 10
.ps 10
.pp
There are some constraints inherent in the NFS\(tm\(mo protocol
that result in performance limitations
for high performance
workstation environments.
This paper discusses an NFS-like protocol named Not Quite NFS (NQNFS),
designed to address some of these limitations.
This protocol provides full cache consistency during normal
operation, while permitting more effective client-side caching in an
effort to improve performance.
There are also a variety of minor protocol changes, in order to resolve
various NFS issues.
The emphasis is on observed performance of a
preliminary implementation of the protocol, in order to show
how well this design works
and to suggest possible areas for further improvement.
.sh 1 "Introduction"
.pp
It has been observed that
overall workstation performance has not been scaling with
processor speed and that file system I/O is a limiting factor [Ousterhout90].
Ousterhout
notes
that a principal challenge for operating system developers is the
decoupling of system calls from their underlying I/O operations, in order
to improve average system call
response times.
For distributed file systems, every synchronous Remote Procedure Call (RPC)
takes a minimum of a few milliseconds and, as such, is analogous to an
underlying I/O operation.
This suggests that client caching with a very good
hit ratio for read type operations, along with asynchronous writing, is
required in order to avoid delays waiting for RPC replies.
However, the NFS protocol requires that the server be stateless\**
.(f
\**The server must not require any state that may be lost due to a crash, to
function correctly.
.)f
and does not provide any explicit mechanism for client cache
consistency, putting
constraints on how the client may cache data.
This paper describes an NFS-like protocol that includes a cache consistency
component designed to enhance client caching performance.
It does provide
full consistency under normal operation, but without requiring that hard
state information be maintained on the server.
Design tradeoffs were made towards simplicity and
high performance over cache consistency under abnormal conditions.
The protocol design uses a variation of Leases [Gray89]
to provide state on the server that does not need to be recovered after a
crash.
.pp
The protocol also includes changes designed to address other limitations
of NFS in a modern workstation environment.
The use of TCP transport is optionally available to avoid
the pitfalls of Sun RPC over UDP transport when running across an
internetwork [Nowicki89].
Kerberos [Steiner88] support is available
to do proper user authentication, in order to provide improved security and
arbitrary client to server user ID mappings.
There are also a variety of other changes to accommodate large file systems,
such as 64bit file sizes and offsets, as well as lifting the 8Kbyte I/O size
limit.
The remainder of this paper gives an overview of the protocol, highlighting
performance related components, followed by an evaluation of resultant
performance
for the 4.4BSD implementation.
.sh 1 "Distributed File Systems and Caching"
.pp
Clients
using distributed file systems cache recently-used data in order
to reduce the number of synchronous server operations, and therefore improve
average response times for system calls.
Unfortunately, maintaining consistency between these caches is a problem
whenever write sharing occurs; that is, when a process on a client writes
to a file and one or more processes on other client(s) read the file.
If the writer closes the file before any reader(s) open the file for reading,
this is called sequential write sharing.
Both the Andrew ITC file system
[Howard88] and NFS [Sandberg85] maintain consistency for sequential write
sharing by requiring the writer to push all the writes through to the
server on close and having readers check to see if the file has been
modified upon open.
If the file has been modified, the client throws away
all cached data for that file, as it is now stale.
NFS implementations typically detect file modification by checking a cached
copy of the file's modification time; since this cached value is often
several seconds out of date and only has a resolution of one second, an NFS
client often uses stale cached data for some time after the file has
been updated on the server.
.pp
A more difficult case is concurrent write sharing, where write operations are
intermixed
with read operations.
Consistency for this case, often referred to as "full cache consistency,"
requires that a reader always receives the most recently written data.
Neither NFS nor the Andrew ITC file system maintains consistency for this
case.
The simplest mechanism for maintaining full cache consistency is the one
used by Sprite [Nelson88], which disables all client caching of the
file whenever concurrent write sharing might occur.
There are other mechanisms described in the literature [Kent87a,
Burrows88], but they appeared to be too elaborate for incorporation
into NQNFS (for example, Kent's requires specialized hardware).
NQNFS differs from Sprite in the way it
detects write sharing.
The Sprite server maintains a list of files currently open
by the various clients and detects write sharing when a file open request
for writing is received and the file is already open for reading
(or vice versa).
This list of open files is hard state information that must be recovered
after a server crash, which is a significant problem in its own
right [Mogul93, Welch90].
.pp
The approach used by NQNFS is a variant of the Leases mechanism [Gray89].
In this model, the server issues to a client a promise, referred to as a
"lease," that the client may cache a specific object without fear of
conflict.
A lease has a limited duration and must be renewed by the client if it
wishes to continue to cache the object.
In NQNFS, clients hold short-term (up to one minute) leases on files
for reading or writing.
The leases are analogous to entries in the open file list, except that
they expire after the lease term unless renewed by the client.
As such, one minute after issuing the last lease there are no current
leases and therefore no lease records to be recovered after a crash, hence
the term "soft server state."
.pp
A related design consideration is the way client writing is done.
Synchronous writing requires that all writes be pushed through to the server
during the write system call.
This is the simplest variant, from a consistency point of view, since the
server always has the most recently written data.
It also permits any write
errors, such as "file system out of space" to be propagated back to the
client's process via the write system call return.
Unfortunately this approach limits the client write rate, based on server write
performance and client/server RPC round trip time (RTT).
.pp
An alternative to this is delayed writing, where the write system call returns
as soon as the data is cached on the client and the data is written to the
server sometime later.
This permits client writing to occur at the rate of local storage access
up to the size of the local cache.
Also, for cases where file truncation/deletion occurs shortly after writing,
the write to the server may be avoided since the data has already been
deleted, reducing server write load.
There are some obvious drawbacks to this approach.
For any Sprite-like system to maintain
full consistency, the server must "callback" to the client to cause the
delayed writes to be written back to the server when write sharing is about to
occur.
There are also problems with the propagation of errors
back to the client process that issued the write system call.
The reason for this is that
the system call has already returned without reporting an error and the
process may also have already terminated.
As well, there is a risk of the loss of recently written data if the client
crashes before the data is written back to the server.
.pp
A compromise between these two alternatives is asynchronous writing, where
the write to the server is initiated during the write system call but the
write system
call returns before the write completes.
This approach minimizes the risk of data loss due to a client crash, but negates
the possibility of reducing server write load by throwing writes away when
a file is truncated or deleted.
.pp
NFS implementations usually do a mix of asynchronous and delayed writing
but push all writes to the server upon close, in order to maintain open/close
consistency.
Pushing the delayed writes on close
negates much of the performance
advantage of delayed writing, since the
delays that were avoided in the write system calls are observed in the close
system call.
Akin to Sprite, the NQNFS protocol does delayed writing in an effort to achieve
good client performance and uses a callback mechanism to maintain full cache
consistency.
.sh 1 "Related Work"
.pp
There has been a great deal of effort put into improving the performance and
consistency of the NFS protocol.
This work can be put in two categories.
The first category is implementation enhancements for the NFS protocol and
the second involves modifications to the protocol.
.pp
The work done on implementation enhancements has attacked two problem areas:
NFS server write performance and RPC transport problems.
Server write performance is a major problem for NFS, in part due to the
requirement to push all writes to the server upon close and in part due
to the fact that, for writes, all data and meta-data must be committed to
non-volatile storage before the server replies to the write RPC.
The Prestoserve\(tm\(dg
[Moran90]
system uses non-volatile RAM as a buffer for recently written data on the
server,
so that the write RPC replies can be returned to the client before the data is
written to the
disk surface.
Write gathering [Juszczak94] is a software technique used on the server where
a write
RPC request is delayed for a short time in the hope that another contiguous
write request will arrive, so that they can be merged into one write operation.
Since the replies to all of the merged writes are not returned to the client
until the write
operation is completed, this delay does not violate the protocol.
When write operations are merged, the number of disk writes can be reduced,
improving server write performance.
Although either of the above reduces write RPC response time for the server,
it cannot be reduced to zero, and so, any client side caching mechanism
that reduces write RPC load or client dependence on server RPC response time
should still improve overall performance.
Good
client side caching should be complementary to these server techniques,
although client performance improvements as a result of caching may be less
dramatic when these techniques are used.
.pp
In NFS, each Sun RPC request is packaged in a UDP datagram for transmission
to the server.
A timer is started, and if a timeout occurs before the corresponding
RPC reply is received, the RPC request is retransmitted.
There are two problems with this model.
First, when a retransmit timeout occurs, the RPC may be redone, instead of
simply retransmitting the RPC request message to the server.
A recent-request
cache can be used on the server to minimize the negative impact of redoing
RPCs [Juszczak89].
The second problem is that a large UDP datagram, such as a read reply or
write request, must be fragmented by IP and if any one IP fragment is lost in
transit, the entire UDP datagram is lost [Kent87].
Since entire requests and replies
are packaged in a single UDP datagram, this puts an upper bound on the
read/write
data size (8 kbytes).
.pp
Adjusting the retransmit timeout (RTO) interval dynamically and applying a
congestion window on outstanding requests has been shown to be of some help
[Nowicki89] with the retransmission problem.
An alternative to this is to use TCP transport to deliver the RPC messages
reliably [Macklem90] and one of the performance results in this paper
shows the effects of this further.
.pp
Srinivasan and Mogul [Srinivasan89] enhanced the NFS protocol to use the
Sprite cache
consistency algorithm in an effort to improve performance and to provide
full client cache consistency.
This experimental implementation demonstrated significantly better
performance than NFS, but suffered from a lack of crash recovery support.
The NQNFS protocol design borrowed heavily from this work, but differed
from the Sprite algorithm by using Leases instead of file open state
to detect write sharing.
The decision to use Leases was made primarily to avoid the crash recovery
problem.
More recent work by the Sprite
group [Baker91] and Mogul [Mogul93] has
addressed the crash recovery problem, making this design tradeoff more
questionable now.
.pp
Sun has recently updated the NFS protocol to Version 3 [SUN93], using some
changes similar to NQNFS to address various issues.
The Version 3 protocol
uses 64bit file sizes and offsets, provides a Readdir_and_Lookup RPC and
an access RPC.
It also provides cache hints, to permit a client to be able to determine
whether a file modification is the result of that client's write or some
other client's write.
It would be possible to add either Spritely NFS or NQNFS support for cache
consistency to the NFS Version 3 protocol.
.sh 1 "NQNFS Consistency Protocol and Recovery"
.pp
The NQNFS cache consistency protocol uses a somewhat Sprite-like [Nelson88]
mechanism, but is based on Leases [Gray89] instead of hard server state
information
about open files.
The basic principle is that the server disables client caching of files whenever
concurrent write sharing could occur, by performing a server-to-client
callback,
forcing the client to flush its caches and to do all subsequent I/O on the
file with
synchronous RPCs.
A Sprite server maintains a record of the open state of files for
all clients and uses this to determine when concurrent write sharing might
occur.
This \fIopen state\fR information might also be referred to as an infinite-term
lease for the file, with explicit lease cancellation.
NQNFS, on the other hand, uses a short-term lease that expires due to timeout
after a maximum of one minute, unless explicitly renewed by the client.
The fundamental difference is that an NQNFS client must keep renewing
a lease to use cached data whereas a Sprite client assumes the data is valid
until canceled
by the server
or the file is closed.
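.pp
The lease model described above can be illustrated with a small sketch.
This is not taken from the NQNFS implementation: the names LeaseTable,
LEASE_TERM, live and request, and the "granted" and "callback" return values,
are hypothetical and chosen only for illustration.
The sketch assumes a fixed one minute term, keeps lease records only in
memory, and reports a conflict whenever another client holds an unexpired
lease and either side wants to write.
Where a real server would perform the server-to-client callback, the sketch
simply returns the list of clients that would need one.
.(l
```python
import time

LEASE_TERM = 60.0  # seconds; the text gives one minute as the maximum term


class LeaseTable:
    """Hypothetical in-memory lease records; nothing is saved across a crash."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.leases = {}  # (file_id, client) -> (mode, expiry time)

    def live(self, file_id):
        """Return {client: mode} for unexpired leases on file_id."""
        now = self.clock()
        return {c: m for (f, c), (m, exp) in self.leases.items()
                if f == file_id and exp > now}

    def request(self, file_id, client, mode):
        """Grant or renew a lease, or report which holders conflict.

        mode is "read" or "write". A conflict exists when another client
        holds a live lease and either side wants to write.
        """
        others = {c: m for c, m in self.live(file_id).items() if c != client}
        conflicts = sorted(c for c, m in others.items()
                           if mode == "write" or m == "write")
        if conflicts:
            # A real server would call back these clients at this point.
            return ("callback", conflicts)
        self.leases[(file_id, client)] = (mode, self.clock() + LEASE_TERM)
        return ("granted", [])
```
.)l
.pp
With a controllable clock passed as the clock argument, two readers of the
same file are both granted leases, a writer from a third client is told which
two clients conflict, and once more than LEASE_TERM seconds pass without a
renewal the writer is granted its lease because no live records remain.
That last step is the sense in which the state is soft: an idle minute leaves
nothing to recover.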