📄 vfs.txt

	if write_begin was called with the AOP_FLAG_UNINTERRUPTIBLE flag).

	The filesystem must take care of unlocking the page and releasing its
	refcount, and updating i_size.

	Returns < 0 on failure, otherwise the number of bytes (<= 'copied')
	that were able to be copied into pagecache.

  bmap: called by the VFS to map a logical block offset within an object
	to a physical block number. This method is used by the FIBMAP
	ioctl and for working with swap-files.  To be able to swap to
	a file, the file must have a stable mapping to a block
	device.  The swap system does not go through the filesystem
	but instead uses bmap to find out where the blocks in the file
	are and uses those addresses directly.

  invalidatepage: If a page has PagePrivate set, then invalidatepage
	will be called when part or all of the page is to be removed
	from the address space.  This generally corresponds to either a
	truncation or a complete invalidation of the address space
	(in the latter case 'offset' will always be 0).
	Any private data associated with the page should be updated
	to reflect this truncation.  If offset is 0, then
	the private data should be released, because the page
	must be able to be completely discarded.  This may be done by
	calling the ->releasepage function, but in this case the
	release MUST succeed.

  releasepage: releasepage is called on PagePrivate pages to indicate
	that the page should be freed if possible.  ->releasepage
	should remove any private data from the page and clear the
	PagePrivate flag.  It may also remove the page from the
	address_space.  If this fails for some reason, it may indicate
	failure with a 0 return value.
	This is used in two distinct though related cases.  The first
	is when the VM finds a clean page with no active users and
	wants to make it a free page.  If ->releasepage succeeds, the
	page will be removed from the address_space and become free.

	The second case is when a request has been made to invalidate
	some or all pages in an address_space.  This can happen
	through the fadvise(POSIX_FADV_DONTNEED) system call or by the
	filesystem explicitly requesting it as nfs and 9fs do (when
	they believe the cache may be out of date with storage) by
	calling invalidate_inode_pages2().
	If the filesystem makes such a call, and needs to be certain
	that all pages are invalidated, then its releasepage will
	need to ensure this.  Possibly it can clear the PageUptodate
	bit if it cannot free private data yet.

  direct_IO: called by the generic read/write routines to perform
	direct_IO - that is IO requests which bypass the page cache
	and transfer data directly between the storage and the
	application's address space.

  get_xip_page: called by the VM to translate a block number to a page.
	The page is valid until the corresponding filesystem is unmounted.
	Filesystems that want to use execute-in-place (XIP) need to implement
	it.  An example implementation can be found in fs/ext2/xip.c.

  migrate_page:  This is used to compact the physical memory usage.
	If the VM wants to relocate a page (maybe off a memory card
	that is signalling imminent failure) it will pass a new page
	and an old page to this function.  migrate_page should
	transfer any private data across and update any references
	that it has to the page.

  launder_page: Called before freeing a page - it writes back the dirty
	page. To prevent redirtying the page, it is kept locked during the
	whole operation.


The File Object
===============

A file object represents a file opened by a process.


struct file_operations
----------------------

This describes how the VFS can manipulate an open file.
As of kernel 2.6.22, the following members are defined:

struct file_operations {
	struct module *owner;
	loff_t (*llseek) (struct file *, loff_t, int);
	ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
	ssize_t (*aio_read) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
	int (*readdir) (struct file *, void *, filldir_t);
	unsigned int (*poll) (struct file *, struct poll_table_struct *);
	int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
	int (*mmap) (struct file *, struct vm_area_struct *);
	int (*open) (struct inode *, struct file *);
	int (*flush) (struct file *);
	int (*release) (struct inode *, struct file *);
	int (*fsync) (struct file *, struct dentry *, int datasync);
	int (*aio_fsync) (struct kiocb *, int datasync);
	int (*fasync) (int, struct file *, int);
	int (*lock) (struct file *, int, struct file_lock *);
	ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
	ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
	ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t, void *);
	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
	int (*check_flags)(int);
	int (*dir_notify)(struct file *filp, unsigned long arg);
	int (*flock) (struct file *, int, struct file_lock *);
	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned int);
	ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned int);
};

Again, all methods are called without any locks being held, unless
otherwise noted.

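The table-of-function-pointers pattern used by struct file_operations can
be tried out in userspace. The following is only a sketch under stated
assumptions, not kernel code: every name beginning with toy_ or hello_ is
hypothetical, the structures are drastically simplified stand-ins, and a
plain "long" replaces loff_t. It shows a generic layer dispatching through
the table and tolerating a method that was left NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct toy_file;

/* Hypothetical, cut-down analogue of struct file_operations. */
struct toy_file_operations {
	ssize_t (*read)(struct toy_file *, char *, size_t, long *);
	int (*flush)(struct toy_file *);	/* optional: may be NULL */
};

/* Hypothetical, cut-down analogue of struct file. */
struct toy_file {
	const struct toy_file_operations *f_op;
	long f_pos;
	void *private_data;
};

/* A "filesystem" read method that serves bytes from a C string. */
ssize_t hello_read(struct toy_file *f, char *buf, size_t len, long *pos)
{
	const char *data = f->private_data;
	size_t size = strlen(data);

	if (*pos < 0 || (size_t)*pos >= size)
		return 0;			/* end of file */
	if (len > size - (size_t)*pos)
		len = size - (size_t)*pos;	/* short read */
	memcpy(buf, data + *pos, len);
	*pos += (long)len;
	return (ssize_t)len;
}

const struct toy_file_operations hello_fops = {
	.read = hello_read,
	/* .flush deliberately left NULL */
};

/* Generic layer: call through the table if the method exists. */
ssize_t toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
	assert(f != NULL);
	if (!f->f_op || !f->f_op->read)
		return -EINVAL;
	return f->f_op->read(f, buf, len, &f->f_pos);
}

/* A missing optional method is treated as "nothing to do". */
int toy_vfs_flush(struct toy_file *f)
{
	assert(f != NULL);
	if (f->f_op && f->f_op->flush)
		return f->f_op->flush(f);
	return 0;
}
```

The generic layer never needs to know which implementation it is talking
to; swapping the f_op pointer is enough to redirect every later call.
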
  llseek: called when the VFS needs to move the file position index

  read: called by read(2) and related system calls

  aio_read: called by io_submit(2) and other asynchronous I/O operations

  write: called by write(2) and related system calls

  aio_write: called by io_submit(2) and other asynchronous I/O operations

  readdir: called when the VFS needs to read the directory contents

  poll: called by the VFS when a process wants to check if there is
	activity on this file and (optionally) go to sleep until there
	is activity. Called by the select(2) and poll(2) system calls

  ioctl: called by the ioctl(2) system call

  unlocked_ioctl: called by the ioctl(2) system call. Filesystems that do not
	require the BKL should use this method instead of the ioctl() above.

  compat_ioctl: called by the ioctl(2) system call when 32 bit system calls
	are used on 64 bit kernels.

  mmap: called by the mmap(2) system call

  open: called by the VFS when an inode should be opened. When the VFS
	opens a file, it creates a new "struct file". It then calls the
	open method for the newly allocated file structure. You might
	think that the open method really belongs in
	"struct inode_operations", and you may be right. I think it's
	done the way it is because it makes filesystems simpler to
	implement.
	The open() method is a good place to initialize the
	"private_data" member in the file structure if you want to point
	to a device structure

  flush: called by the close(2) system call to flush a file

  release: called when the last reference to an open file is closed

  fsync: called by the fsync(2) system call

  fasync: called by the fcntl(2) system call when asynchronous
	(non-blocking) mode is enabled for a file

  lock: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW
	commands

  readv: called by the readv(2) system call

  writev: called by the writev(2) system call

  sendfile: called by the sendfile(2) system call

  get_unmapped_area: called by the mmap(2) system call

  check_flags: called by the fcntl(2) system call for F_SETFL command

  dir_notify: called by the fcntl(2) system call for F_NOTIFY command

  flock: called by the flock(2) system call

  splice_write: called by the VFS to splice data from a pipe to a file. This
	method is used by the splice(2) system call

  splice_read: called by the VFS to splice data from a file to a pipe. This
	method is used by the splice(2) system call

Note that the file operations are implemented by the specific
filesystem in which the inode resides. When opening a device node
(character or block special) most filesystems will call special
support routines in the VFS which will locate the required device
driver information. These support routines replace the filesystem file
operations with those for the device driver, and then proceed to call
the new open() method for the file. This is how opening a device file
in the filesystem eventually ends up calling the device driver open()
method.


Directory Entry Cache (dcache)
==============================


struct dentry_operations
------------------------

This describes how a filesystem can overload the standard dentry
operations. Dentries and the dcache are the domain of the VFS and the
individual filesystem implementations. Device drivers have no business
here.
These methods may be set to NULL, as they are either optional or
the VFS uses a default. As of kernel 2.6.22, the following members are
defined:

struct dentry_operations {
	int (*d_revalidate)(struct dentry *, struct nameidata *);
	int (*d_hash) (struct dentry *, struct qstr *);
	int (*d_compare) (struct dentry *, struct qstr *, struct qstr *);
	int (*d_delete)(struct dentry *);
	void (*d_release)(struct dentry *);
	void (*d_iput)(struct dentry *, struct inode *);
	char *(*d_dname)(struct dentry *, char *, int);
};

  d_revalidate: called when the VFS needs to revalidate a dentry. This
	is called whenever a name look-up finds a dentry in the
	dcache. Most filesystems leave this as NULL, because all their
	dentries in the dcache are valid

  d_hash: called when the VFS adds a dentry to the hash table

  d_compare: called when a dentry should be compared with another

  d_delete: called when the last reference to a dentry is
	deleted. This means no-one is using the dentry, however it is
	still valid and in the dcache

  d_release: called when a dentry is really deallocated

  d_iput: called when a dentry loses its inode (just prior to its
	being deallocated). The default when this is NULL is that the
	VFS calls iput(). If you define this method, you must call
	iput() yourself

  d_dname: called when the pathname of a dentry should be generated.
	Useful for some pseudo filesystems (sockfs, pipefs, ...) to delay
	pathname generation. (Instead of doing it when the dentry is
	created, it's done only when the path is needed.) Real filesystems
	probably don't want to use it, because their dentries are present
	in the global dcache hash, so their hash should be an invariant.
	As no lock is held, d_dname() should not try to modify the dentry
	itself, unless appropriate SMP safety is used. CAUTION: d_path()
	logic is quite tricky. The correct way to return for example
	"Hello" is to put it at the end of the buffer, and return a
	pointer to the first char.
	dynamic_dname() helper function is provided to take care of this.

Example :

static char *pipefs_dname(struct dentry *dentry, char *buffer, int buflen)
{
	return dynamic_dname(dentry, buffer, buflen, "pipe:[%lu]",
				dentry->d_inode->i_ino);
}

Each dentry has a pointer to its parent dentry, as well as a hash list
of child dentries. Child dentries are basically like files in a
directory.


Directory Entry Cache API
--------------------------

There are a number of functions defined which permit a filesystem to
manipulate dentries:

  dget: open a new handle for an existing dentry (this just increments
	the usage count)

  dput: close a handle for a dentry (decrements the usage count). If
	the usage count drops to 0, the "d_delete" method is called
	and the dentry is placed on the unused list if the dentry is
	still in its parent's hash list. Putting the dentry on the
	unused list just means that if the system needs some RAM, it
	goes through the unused list of dentries and deallocates them.
	If the dentry has already been unhashed and the usage count
	drops to 0, the dentry is deallocated after the "d_delete"
	method is called

  d_drop: this unhashes a dentry from its parent's hash list. A
	subsequent call to dput() will deallocate the dentry if its
	usage count drops to 0

  d_delete: delete a dentry. If there are no other open references to
	the dentry then the dentry is turned into a negative dentry
	(the d_iput() method is called). If there are other
	references, then d_drop() is called instead

  d_add: adds a dentry to its parent's hash list and then calls
	d_instantiate()

  d_instantiate: adds a dentry to the alias hash list for the inode and
	updates the "d_inode" member. The "i_count" member in the
	inode structure should be set/incremented. If the inode
	pointer is NULL, the dentry is called a "negative
	dentry".
	This function is commonly called when an inode is
	created for an existing negative dentry

  d_lookup: look up a dentry given its parent and path name component.
	It looks up the child of that given name from the dcache
	hash table. If it is found, the reference count is incremented
	and the dentry is returned. The caller must use dput()
	to free the dentry when it finishes using it.

For further information on dentry locking, please refer to the document
Documentation/filesystems/dentry-locking.txt.


Resources
=========

(Note some of these resources are not up-to-date with the latest kernel
 version.)

Creating Linux virtual filesystems. 2002
    <http://lwn.net/Articles/13325/>

The Linux Virtual File-system Layer by Neil Brown. 1999
    <http://www.cse.unsw.edu.au/~neilb/oss/linux-commentary/vfs.html>

A tour of the Linux VFS by Michael K. Johnson. 1996
    <http://www.tldp.org/LDP/khg/HyperNews/get/fs/vfstour.html>

A small trail through the Linux kernel by Andries Brouwer. 2001
    <http://www.win.tue.nl/~aeb/linux/vfs/trail.html>
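
As a closing illustration, the usage-count rules described for dget, dput
and d_drop in the Directory Entry Cache API section can be modelled in a
few lines of userspace C. This is only a sketch under stated assumptions:
all toy_ names are hypothetical, there is no locking, the "d_delete"
method call is omitted, and the unused list is reduced to a single flag.
It is not how the kernel implements the dcache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the three pieces of state the text mentions. */
struct toy_dentry {
	int count;		/* usage count */
	bool hashed;		/* still on the parent's hash list? */
	bool on_unused_list;	/* parked until memory is needed */
};

int toy_freed;	/* number of dentries deallocated, for demonstration */

/* Create a hashed dentry holding one reference. */
struct toy_dentry *toy_dentry_new(void)
{
	struct toy_dentry *d = calloc(1, sizeof(*d));

	if (d) {
		d->count = 1;
		d->hashed = true;
	}
	return d;
}

/* dget analogue: just increments the usage count. */
struct toy_dentry *toy_dget(struct toy_dentry *d)
{
	d->count++;
	return d;
}

/* d_drop analogue: unhash the dentry from its parent. */
void toy_d_drop(struct toy_dentry *d)
{
	d->hashed = false;
}

/*
 * dput analogue.  Returns true if the dentry was deallocated.
 * A hashed dentry whose count reaches 0 is only parked on the
 * unused list; an unhashed one is freed immediately.
 */
bool toy_dput(struct toy_dentry *d)
{
	assert(d->count > 0);
	if (--d->count > 0)
		return false;
	if (d->hashed) {
		d->on_unused_list = true;
		return false;
	}
	free(d);
	toy_freed++;
	return true;
}
```

In this model a dentry that is still hashed survives its last dput() and
waits on the unused list, while one that was passed to toy_d_drop() first
is freed as soon as its count reaches 0.
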