The Linux Kernel Virtual File System (VFS) Reference
====================================================
Based on kernel version 2.1.61
First Edition
November 3, 1997
Written by
Andrew E. Mileski
Abstract
--------
The Linux Kernel Virtual File System is simple in concept, but complex
in implementation, which makes it difficult for the first-time file
system developer to quickly understand. Having the kernel source code
at hand makes it possible to learn all of the kernel's secrets, but
this can consume much valuable development time. This document strives
to flatten the normally steep learning curve into a gentle slope, so
that a file system developer can spend less time studying the kernel
source code, and more time on programming. Even so, this document is
only a reference, and not a how-to manual.
Table of Contents
-----------------
0. File System Type Structure
0.1 name
0.2 fs_flags
0.3 read_super
0.4 next
1. Super blocks
1.1 Super block Structure
1.1.1 s_dev
1.1.2 s_blocksize
1.1.3 s_blocksize_bits
1.1.4 s_lock
1.1.5 s_rd_only
1.1.6 s_dirt
1.1.7 s_type
1.1.8 s_op
1.1.9 dq_op
1.1.10 s_flags
1.1.11 s_magic
1.1.12 s_time
1.1.13 s_root
1.1.14 s_wait
1.1.15 s_ibasket
1.1.16 s_ibasket_count
1.1.17 s_ibasket_max
1.1.18 s_dirty
1.1.19 u
1.2 Super block Operations Structure
1.2.1 read_inode
1.2.2 write_inode
1.2.3 put_inode
1.2.4 delete_inode
1.2.5 notify_change
1.2.6 put_super
1.2.7 write_super
1.2.8 statfs
1.2.9 remount_fs
2. Inodes
2.1 Inode Structure
2.1.1 i_hash
2.1.2 i_list
2.1.3 i_ino
2.1.4 i_dev
2.1.5 i_count
2.1.6 i_mode
2.1.7 i_nlink
2.1.8 i_uid
2.1.9 i_gid
2.1.10 i_rdev
2.1.11 i_size
2.1.12 i_atime
2.1.13 i_mtime
2.1.14 i_ctime
2.1.15 i_blksize
2.1.16 i_blocks
2.1.17 i_version
2.1.18 i_nrpages
2.1.19 i_sem
2.1.20 i_op
2.1.21 i_sb
2.1.22 i_wait
2.1.23 i_flock
2.1.24 i_mmap
2.1.25 i_pages
2.1.26 i_dquot
2.1.27 i_state
2.1.28 i_flags
2.1.29 i_pipe
2.1.30 i_sock
2.1.31 i_writecount
2.1.32 i_attr_flags
2.1.33 u
2.2 Inode Operations Structure
2.2.1 default_file_ops
2.2.2 create
2.2.3 lookup
2.2.4 link
2.2.5 unlink
2.2.6 symlink
2.2.7 mkdir
2.2.8 rmdir
2.2.9 mknod
2.2.10 rename
2.2.11 readlink
2.2.12 follow_link
2.2.13 readpage
2.2.14 writepage
2.2.15 bmap
2.2.16 truncate
2.2.17 permission
2.2.18 smap
2.2.19 updatepage
2.2.20 revalidate
2.3 The Inode Cache
3. Files
3.1 File Structure
3.1.1 f_next
3.1.2 f_pprev
3.1.3 f_dentry
3.1.4 f_op
3.1.5 f_mode
3.1.6 f_pos
3.1.7 f_count
3.1.8 f_flags
3.1.9 f_reada
3.1.10 f_ramax
3.1.11 f_raend
3.1.12 f_ralen
3.1.13 f_rawin
3.1.14 f_owner
3.1.15 f_version
3.1.16 private_data
3.2 File Operations Structure
3.2.1 llseek
3.2.2 read
3.2.3 write
3.2.4 readdir
3.2.5 poll
3.2.6 ioctl
3.2.7 mmap
3.2.8 open
3.2.9 release
3.2.10 fsync
3.2.11 fasync
3.2.12 check_media_change
3.2.13 revalidate
3.2.14 lock
3.3 The File Table
4. The Directory Entry Cache
4.1 Dentry Structure
4.1.1 d_count
4.1.2 d_flags
4.1.3 d_inode
4.1.4 d_parent
4.1.5 d_mounts
4.1.6 d_covers
4.1.7 d_hash
4.1.8 d_lru
4.1.9 d_name
4.1.10 d_time
4.1.11 d_op
4.1.12 d_sb
4.2 Dentry Operations Structure
4.2.1 d_revalidate
4.2.2 d_hash
4.2.3 d_compare
4.2.4 d_delete
4.3 Quick Strings
4.3.1 Quick String Structure
4.3.1.1 name
4.3.1.2 len
4.3.1.3 hash
4.3.2 Hashing a Quick String
5. The Buffer Cache
5.1 The Buffer Head Structure
5.1.1 b_next
5.1.2 b_blocknr
5.1.3 b_size
5.1.4 b_dev
5.1.5 b_rdev
5.1.6 b_rsector
5.1.7 b_this_page
5.1.8 b_state
5.1.9 b_next_free
5.1.10 b_count
5.1.11 b_data
5.1.12 b_list
5.1.13 b_flushtime
5.1.14 b_lru_time
5.1.15 b_wait
5.1.16 b_pprev
5.1.17 b_prev_free
5.1.18 b_reqnext
5.2 Reading a Buffer
5.3 Writing a Buffer
4. Regular files
5. Directories
6. Links
7. Symbolic links
8. Block Devices
9. Character Devices
10. Memory Mapped Files
11. Quotas
11.1 Quota Operations Structure
11.1.1 initialize
11.1.2 drop
11.1.3 alloc_block
11.1.4 alloc_inode
11.1.5 free_block
11.1.6 free_inode
11.1.7 transfer
12. The VFS Character Set
12.1 Encoding with UTF-8.
Appendix A List of Kernel Filesystem Routines
Appendix B Linux on the Internet
-----------------
Preface
-------
This document is a result of many hours of studying the Linux kernel
source code by the author, in an attempt to implement a new file
system type [OSTA-UDF(tm) to be exact]. Like all tools, this document
was born out of frustration with the lack of documentation available on
the subject. A few seasoned Linux kernel developers claim the kernel
source code is "self documenting", a claim with which the author of
this document strongly disagrees.
Though every attempt has been made to ensure the correctness of this
document, the author cannot make any guarantees as to its accuracy.
The Linux kernel source code is constantly being revised, debugged,
and upgraded by volunteer programmers from around the globe. This
particular document is based on the 2.1.61 experimental kernel, which
was the latest available at the time of writing.
0. File System Type Structure
------------------------------
The following definition of the file system type structure can be found
in the header file linux/include/linux/fs.h
struct file_system_type {
	const char *name;
	int fs_flags;
	struct super_block *(*read_super) (struct super_block *,
					   void *, int);
	struct file_system_type * next;
};
This structure is registered with the kernel to add support for the
new file system type.
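As a rough illustration of how the next field chains registered types
together, the following user-space sketch keeps a singly linked list of
file system types and looks one up by name, much as a mount request
would. It is not kernel code: demo_file_systems, demo_register_filesystem(),
demo_get_fs_type(), demo_read_super() and the "demofs" type are all
hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct super_block;			/* opaque in this sketch */

struct file_system_type {
	const char *name;
	int fs_flags;
	struct super_block *(*read_super)(struct super_block *, void *, int);
	struct file_system_type *next;
};

/* Head of the hypothetical list of registered file system types. */
static struct file_system_type *demo_file_systems = NULL;

/* Append fs to the list. Returns 0 on success, -1 if the name is taken. */
int demo_register_filesystem(struct file_system_type *fs)
{
	struct file_system_type **p;

	assert(fs != NULL && fs->name != NULL);
	for (p = &demo_file_systems; *p != NULL; p = &(*p)->next)
		if (strcmp((*p)->name, fs->name) == 0)
			return -1;
	fs->next = NULL;
	*p = fs;
	return 0;
}

/* Find a registered type by its name string, or return NULL. */
struct file_system_type *demo_get_fs_type(const char *name)
{
	struct file_system_type *fs;

	for (fs = demo_file_systems; fs != NULL; fs = fs->next)
		if (strcmp(fs->name, name) == 0)
			return fs;
	return NULL;
}

/* Hypothetical driver entry point: a stub that always fails. */
static struct super_block *demo_read_super(struct super_block *sb,
					   void *data, int silent)
{
	(void)sb; (void)data; (void)silent;
	return NULL;
}

/* Fields in declaration order: name, fs_flags, read_super, next. */
static struct file_system_type demo_fs_type = {
	"demofs", 0, demo_read_super, NULL
};
```

Calling demo_register_filesystem(&demo_fs_type) once makes the type
visible to demo_get_fs_type("demofs"); a second registration under the
same name is refused.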
0.1 name
---------
Definition:
const char *name;
Purpose:
Pointer to a name string for the file system type.
Description:
The name string is used to locate the file system driver to use
when mounting.
For convenience, the name should be short (8 characters or less
is reasonable), and should only contain printable lower case
letters [a-z] and numbers [0-9].
The following are reserved:
affs        Amiga Fast File System
afs         Andrew File System
autofs      AUTO-mounter File System
coherent    Coherent
ext         EXTended (obsolete)
ext2        Second EXTended (Linux native)
hpfs        High Performance File System (OS/2)
iso9660     ISO 9660 compliant
minix       Minix
msdos       MS-DOS (FAT12 and FAT16)
ncpfs       Novell File System
nfs         Network File System
proc        PROCess
romfs       ROM File System
smbfs       SamBA File System
sysv        SYSV (and SYSV2, SYSV4)
udf         Universal Disk Format
ufs
umsdos      Use MS-DOS
vfat        Versatile FAT (extended FAT16, FAT32)
xenix       Xenix
Status:
Required.
0.2 fs_flags
-------------
Definition:
int fs_flags;
Purpose:
File system flags.
Description:
The following flags can be combined with a bitwise OR:
FS_REQUIRES_DEV     File system requires a block device.
FS_NO_DCACHE        Only dcache the necessary things.
FS_NO_PRELIM        Prevent preloading of dentries, even if
                    FS_NO_DCACHE is not set.
FS_IBASKET          File system does callback to
                    free_ibasket() if space gets low.
Status:
Required.
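The sketch below shows one way the flags might be combined and tested.
The numeric values given to the FS_* names here are assumptions made
only so the example compiles on its own; the definitions in
linux/include/linux/fs.h are authoritative. The demo_* helpers are
hypothetical.

```c
#include <assert.h>

/* Values assumed for illustration only; see linux/include/linux/fs.h. */
#define FS_REQUIRES_DEV	1
#define FS_NO_DCACHE	2
#define FS_NO_PRELIM	4
#define FS_IBASKET	8

/* Returns non-zero if every bit of flag is set in fs_flags. */
int demo_fs_has_flag(int fs_flags, int flag)
{
	return (fs_flags & flag) == flag;
}

/* A disk based type that also wants the free_ibasket() callback. */
int demo_disk_fs_flags(void)
{
	return FS_REQUIRES_DEV | FS_IBASKET;
}
```

A network or pseudo file system would simply leave FS_REQUIRES_DEV out
of the OR expression.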
0.3 read_super
---------------
Definition:
struct super_block *(*read_super) (struct super_block *,
void *, int);
Purpose:
Read the super block from a device.
Description:
Refer to section 1. on super blocks.
Status:
Required.
1. Super Block
---------------
A super block contains information about a file system as a whole.
There is one super block for every mounted file system. The number
of system wide super blocks, and hence the maximum number of file
systems that can be mounted simultaneously, is controlled by the
NR_SUPER define found in include/linux/fs.h (default of 64).
1.1 Super Block Structure
--------------------------
The following definition of the super block structure can be found in
the header file linux/include/linux/fs.h
struct super_block {
	kdev_t s_dev;
	unsigned long s_blocksize;
	unsigned char s_blocksize_bits;
	unsigned char s_lock;
	unsigned char s_rd_only;
	unsigned char s_dirt;
	struct file_system_type *s_type;
	struct super_operations *s_op;
	struct dquot_operations *dq_op;
	unsigned long s_flags;
	unsigned long s_magic;
	unsigned long s_time;
	struct dentry *s_root;
	struct wait_queue *s_wait;
	struct inode *s_ibasket;
	short int s_ibasket_count;
	short int s_ibasket_max;
	struct list_head s_dirty;
	union {
		struct minix_sb_info minix_sb;
		struct ext2_sb_info ext2_sb;
		struct hpfs_sb_info hpfs_sb;
		struct msdos_sb_info msdos_sb;
		struct isofs_sb_info isofs_sb;
		struct nfs_sb_info nfs_sb;
		struct sysv_sb_info sysv_sb;
		struct affs_sb_info affs_sb;
		struct ufs_sb_info ufs_sb;
		struct romfs_sb_info romfs_sb;
		struct smb_sb_info smbfs_sb;
		void *generic_sbp;
	} u;
};
1.1.1 s_dev
------------
Definition:
kdev_t s_dev;
Purpose:
The primary device that the filesystem resides on.
Description:
All operations that the file system performs use the device
defined by this field, or directly derived from this field.
An unused super block has this field set to NODEV.
The kdev_t type is defined in linux/include/linux/kdev_t.h
Note: If the file system resides on more than one device, the
specification of the other devices is outside the scope of the
VFS, and is up to the file system itself to keep track of.
Status:
Required. The kernel sets the value of this field.
1.1.2 s_blocksize
------------------
Definition:
unsigned long s_blocksize;
Purpose:
The size in bytes of the buffers used by the device the file
system resides on.
Description:
The kernel supports block sizes of 512, 1024, 2048, 4096, and
8192 bytes.
Notice that the block size is always a power of two, an even
multiple of BLOCK_SIZE, and fits onto a single page of memory
(PAGE_SIZE bytes long - either 4096 or 8192 depending on the
architecture).
Since bus mastering devices use DMA transfers on a buffer, the
block size must be the same or larger than the physical sector
size.
get_hard_blocksize() and set_blocksize() are both defined in
linux/fs/buffer.c
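The constraints above can be collected into a single check. This is a
user-space sketch only, not the test the kernel performs:
demo_blocksize_ok() is a hypothetical helper, and the page size and
physical sector size are passed in as plain parameters instead of
being queried from the kernel or the device.

```c
#include <assert.h>

/*
 * Returns non-zero if size is usable as a block size under the rules
 * described in this section: at least 512 bytes, a power of two, no
 * larger than one page, and no smaller than the physical sector size.
 */
int demo_blocksize_ok(unsigned long size, unsigned long page_size,
		      unsigned long sector_size)
{
	if (size < 512 || size > page_size)
		return 0;
	if (size & (size - 1))		/* more than one bit set */
		return 0;
	return size >= sector_size;
}
```

With a 4096 byte page and 512 byte sectors, 1024 passes while 3000
(not a power of two) and 8192 (larger than the page) do not.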
Status:
1.1.3 s_blocksize_bits
-----------------------
Definition:
unsigned char s_blocksize_bits;
Purpose:
The number of bits in the block size.
Description:
Since the block size is always a power of two [refer to
s_blocksize in section 1.1.2], this field is the log base 2
of the block size.
This field exists to make fast binary operations on the block
size possible, like division using shifts, or creating a bit
mask.
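The shift and mask operations mentioned above can be sketched as
follows. The demo_* function names are hypothetical and the code is
ordinary user-space C, shown only to make the arithmetic concrete.

```c
#include <assert.h>

/* Log base 2 of a block size that is known to be a power of two. */
unsigned char demo_blocksize_bits(unsigned long blocksize)
{
	unsigned char bits = 0;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	while ((1UL << bits) < blocksize)
		bits++;
	return bits;
}

/* Block number containing a byte offset: division done as a shift. */
unsigned long demo_block_of(unsigned long offset, unsigned char bits)
{
	return offset >> bits;
}

/* Position of a byte offset inside its block: remainder done as a mask. */
unsigned long demo_offset_in_block(unsigned long offset, unsigned char bits)
{
	return offset & ((1UL << bits) - 1);
}
```

For a block size of 1024 the bit count is 10, so byte offset 5000 falls
in block 4 at position 904 within that block.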
Status:
1.1.4 s_lock
-------------
Definition:
unsigned char s_lock;
Purpose:
This field is used for a simple super block locking mechanism.
Description:
Setting this field to a non-zero value (normally 1), means the
super block is locked. A value of zero means it is unlocked.
Any process waiting on a locked super block will be put on the
waiting list [refer to s_wait in section 1.1.14], and have its
state changed to uninterruptible until the super block becomes
unlocked.
Anytime a super block is to be changed, it must first be locked
by a call to lock_super(), and then unlocked with a call to
unlock_super() after the changes are complete. Both of these
super block locking routines are declared in the header file
linux/include/linux/locks.h
Also refer to s_time in section 1.1.12 for the time of the last
super block change.
Status:
Required. The lock_super() and unlock_super() routines are the
only routines that should access this field.
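The lock, change, unlock sequence might look like the following
single-threaded sketch. It uses a cut-down stand-in structure and
hypothetical demo_* routines, not the real lock_super() and
unlock_super(). Since there is no scheduler in user space, the points
where the kernel would sleep on, or wake up, the s_wait queue are
marked with comments only.

```c
#include <assert.h>

/* Cut-down stand-in for the super block; only the fields used here. */
struct demo_super_block {
	unsigned char s_lock;
	unsigned char s_dirt;
	unsigned long s_time;
};

void demo_lock_super(struct demo_super_block *sb)
{
	/*
	 * The kernel would put the caller to sleep on s_wait while the
	 * lock is held. With a single thread, a held lock is a bug.
	 */
	assert(sb->s_lock == 0);
	sb->s_lock = 1;
}

void demo_unlock_super(struct demo_super_block *sb)
{
	sb->s_lock = 0;
	/* The kernel would wake the processes queued on s_wait here. */
}

/* Change the super block under the lock and mark it for syncing. */
void demo_touch_super(struct demo_super_block *sb, unsigned long now)
{
	demo_lock_super(sb);
	sb->s_time = now;	/* time of the last super block change */
	sb->s_dirt = 1;		/* to be written at the next sync */
	demo_unlock_super(sb);
}
```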
1.1.5 s_rd_only
----------------
Definition:
unsigned char s_rd_only;
Purpose:
Description:
Status:
1.1.6 s_dirt
-------------
Definition:
unsigned char s_dirt;
Purpose:
The dirty flag for the super block.
Description:
When s_dirt is non-zero (normally 1), the super block is
considered dirty, and will be processed the next time the super
block is synced.
Refer to write_super() in section ?.??.?
Status:
Required. Can be kept zero if super block is never synced.
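How a sync pass might use the dirty flag is sketched below. This is a
user-space approximation under stated assumptions, not the kernel's
own routine: the structures are cut-down stand-ins, s_dev is a plain
int with zero standing for an unused slot, and demo_sync_supers() and
demo_write_super() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct demo_super_block;

struct demo_super_operations {
	void (*write_super)(struct demo_super_block *);
};

struct demo_super_block {
	int s_dev;			/* 0 means the slot is unused */
	unsigned char s_dirt;
	struct demo_super_operations *s_op;
};

/* Write every in-use, dirty super block. Returns how many were written. */
int demo_sync_supers(struct demo_super_block *table, size_t n)
{
	int written = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct demo_super_block *sb = &table[i];

		if (!sb->s_dev || !sb->s_dirt)
			continue;
		if (sb->s_op && sb->s_op->write_super) {
			sb->s_op->write_super(sb);
			written++;
		}
	}
	return written;
}

/* Hypothetical write_super(): pretend to write, then clear the flag. */
void demo_write_super(struct demo_super_block *sb)
{
	sb->s_dirt = 0;
}

struct demo_super_operations demo_sops = { demo_write_super };
```

In this sketch it is the file system's write_super() routine that
clears s_dirt once the super block has been written.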
1.1.7 s_type
-------------
Definition:
struct file_system_type *s_type;
Purpose:
The pointer to the file system type structure for the file
system.
Description:
The file system type structure contains a pointer to the
read_super() routine [refer to section ?.?.?].
Refer to section ?.?.? for details on the file system type
structure.
Status:
Required. The kernel fills in this field for the file system.
1.1.8 s_op
-----------
Definition:
struct super_operations *s_op;
Purpose:
Pointer to the super block operations structure for the super
block.
Description:
Refer to section ?.?.? for details on the structure.
Status:
Optional. May be NULL, though this is a useless file system.
1.1.9 dq_op
------------
Definition:
struct dquot_operations *dq_op;
Purpose:
Pointer to the quota operations structure for the file system.
Description:
Refer to section ?.?.? for details on the structure.
Status:
Optional. May be null if quotas are not supported.
1.1.10 s_flags
---------------
Definition:
unsigned long s_flags;
Purpose:
File system independent mount flags.
Description:
The sys_mount() call passes these flags to the file system.
An inode normally inherits these [refer to i_flags in section
?.?.?].
Flag              Meaning
----              -------
MS_RDONLY         Mount file system read-only.
MS_NOSUID         Ignore suid and sgid bits.
MS_NODEV          Disallow access to device special files.
MS_NOEXEC         Disallow program execution.
MS_SYNCHRONOUS    Writes are synced at once.
MS_REMOUNT        Alter flags of a mounted file system.
MS_MANDLOCK       Allow mandatory locks on a file system.
MS_NOATIME        Do not update access times.
MS_RMT_MASK       Flags that can be altered by MS_REMOUNT:
                  MS_RDONLY MS_NOSUID MS_NODEV MS_NOEXEC
                  MS_SYNCHRONOUS MS_MANDLOCK MS_NOATIME
When doing a re-mount operation, a change of flags is indicated
by a bitwise OR of the flags with MS_MGC_VAL. A bitwise AND of
the flags with the complement of MS_MGC_MSK (the mask that covers
the magic value) will give only the flags.
Status:
Required.
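The packing of a magic value together with the mount flags can be
sketched as below. The DEMO_ constants are stand-ins whose values are
assumptions chosen for this example (a magic value occupying the upper
16 bits of a 32-bit word, with the flags in the lower 16 bits); the
kernel headers are authoritative for the real definitions. The demo_*
helpers are hypothetical.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define DEMO_MS_RDONLY	 1UL
#define DEMO_MS_NOSUID	 2UL
#define DEMO_MS_MGC_VAL	 0xC0ED0000UL	/* magic in the upper 16 bits */
#define DEMO_MS_MGC_MSK	 0xFFFF0000UL	/* mask covering the magic */

/* Mark a flag word as carrying valid flags by OR-ing in the magic. */
unsigned long demo_pack_flags(unsigned long flags)
{
	return DEMO_MS_MGC_VAL | flags;
}

/* Returns non-zero if the magic value is present in the flag word. */
int demo_has_magic(unsigned long packed)
{
	return (packed & DEMO_MS_MGC_MSK) == DEMO_MS_MGC_VAL;
}

/* Strip the magic value, leaving only the mount flags. */
unsigned long demo_unpack_flags(unsigned long packed)
{
	return packed & ~DEMO_MS_MGC_MSK;
}
```

Under these assumptions, masking with the complement of the magic mask
recovers exactly the flags that were OR-ed in.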
1.1.11 s_magic
---------------
Definition:
unsigned long s_magic;
Purpose:
A unique number that can be used to identify the super block.
Description:
The magic number is for the use of the file system. This field
is not referred to by the kernel.
Status:
Optional. Should be set to some useful value though.
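One possible use of the field is sketched below: a file system checks
the magic number found in its on-disk data and, if it matches, records
it in s_magic. Everything here is hypothetical and invented for the
example, including the magic value, the on-disk layout (four bytes,
most significant first), and the demo_* names.

```c
#include <assert.h>

#define DEMO_SUPER_MAGIC 0x44454D4FUL	/* hypothetical: ASCII "DEMO" */

/* Cut-down stand-in for the super block; only the field used here. */
struct demo_super_block {
	unsigned long s_magic;
};

/* Decode the hypothetical on-disk magic: 4 bytes, big-endian. */
unsigned long demo_disk_magic(const unsigned char *raw)
{
	return ((unsigned long)raw[0] << 24) |
	       ((unsigned long)raw[1] << 16) |
	       ((unsigned long)raw[2] << 8) |
	       (unsigned long)raw[3];
}

/* Returns 0 and sets s_magic on a match, -1 if the magic is wrong. */
int demo_fill_magic(struct demo_super_block *sb, const unsigned char *raw)
{
	unsigned long magic = demo_disk_magic(raw);

	if (magic != DEMO_SUPER_MAGIC)
		return -1;
	sb->s_magic = magic;
	return 0;
}
```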
1.1.12 s_time
--------------
Definition:
unsigned long s_time;
Purpose:
Time of the last super block change.
Description:
This field should be updated whenever the super block is
changed. This makes version recognition possible.
Status:
Optional. Should be set to the time of mount if not used.
1.1.13 s_root
--------------
Definition:
struct dentry *s_root;
Purpose:
Pointer to the root dentry structure.
Description:
Refer to section ?.??.?? for details on the dentry structure.
Status:
Required. Must be valid for a mounted file system.
1.1.14 s_wait
--------------
Definition:
struct wait_queue *s_wait;
Purpose:
Pointer to the queue of processes waiting on the super block.
Description:
Refer to s_lock in section 1.1.4 for details on locking.
Status:
Required. This field is maintained by the kernel.
1.1.15 s_ibasket
-----------------