BBS 水木清华站 (SMTH BBS): Digest Area

From: clamor (clamor), Board: Linux
Subject: Linux Kernel Internals-3(VFS)
Posted at: BBS 水木清华站 (Tue Dec 19 21:35:14 2000)

3. Virtual Filesystem (VFS)

3.1 Inode Caches and Interaction with Dcache

In order to support multiple filesystems, Linux contains a special kernel interface level called VFS - Virtual Filesystem Switch. This is similar to the vnode/vfs interface found in SVR4 derivatives (originally it came from BSD and Sun original implementations).

The Linux inode cache is implemented in a single file, fs/inode.c, which consists of 977 lines of code. It is interesting to note that for the last 5-7 years not many changes were made to it, i.e. one can still recognize some of the code when comparing the latest version with, say, 1.3.42.

The structure of the Linux inode cache is as follows:

1. A global hashtable, inode_hashtable. Each inode is hashed by the value of the superblock pointer and the 32bit inode number. Inodes without a superblock (inode->i_sb == NULL) are added to a doubly linked list headed by anon_hash_chain instead. Examples of anonymous inodes are sockets created by net/socket.c:sock_alloc() by calling fs/inode.c:get_empty_inode().
2. A global type in_use list (inode_in_use) which contains valid inodes with i_count > 0, i_nlink > 0. Inodes newly allocated by get_empty_inode() and get_new_inode() are added to the inode_in_use list.
3. A global type unused list (inode_unused) which contains valid inodes with i_count = 0.
4. A per-superblock type dirty list (sb->s_dirty) which contains valid inodes with i_count > 0, i_nlink > 0 and i_state & I_DIRTY. When an inode is marked dirty it is added to the sb->s_dirty list if it is also hashed. Maintaining a per-superblock dirty list of inodes makes it possible to sync inodes quickly.
5. The inode cache proper - a SLAB cache called inode_cachep. As inode objects are allocated and freed, they are taken from and returned to this SLAB cache.

The type lists are anchored from inode->i_list, the hashtable from inode->i_hash. Each inode can be on a hashtable and on one and only one type (in_use, unused or dirty) list.

All these lists are protected by a single spinlock - inode_lock.

The inode cache subsystem is initialised when the inode_init() function is called from init/main.c:start_kernel(). The function is marked as __init, which means its code is thrown away later on. It is passed a single argument - the number of physical pages on the system. This is so that the inode cache can configure itself depending on how much memory is available, i.e. create a larger hashtable if there is enough memory.

The only stats information about the inode cache is the number of unused inodes, stored in inodes_stat.nr_unused and accessible to user programs via the files /proc/sys/fs/inode-nr and /proc/sys/fs/inode-state.

We can examine one of the lists from gdb running on a live kernel thus:

    (gdb) printf "%d\n", (unsigned long)(&((struct inode *)0)->i_list)
    8
    (gdb) p inode_unused
    $34 = 0xdfa992a8
    (gdb) p (struct list_head)inode_unused
    $35 = {next = 0xdfa992a8, prev = 0xdfcdd5a8}
    (gdb) p ((struct list_head)inode_unused).prev
    $36 = (struct list_head *) 0xdfcdd5a8
    (gdb) p (((struct list_head)inode_unused).prev)->prev
    $37 = (struct list_head *) 0xdfb5a2e8
    (gdb) set $i = (struct inode *)0xdfb5a2e0
    (gdb) p $i->i_ino
    $38 = 0x3bec7
    (gdb) p $i->i_count
    $39 = {counter = 0x0}

Note that we deducted 8 from the address 0xdfb5a2e8 to obtain the address of the 'struct inode', 0xdfb5a2e0, according to the definition of the list_entry() macro from include/linux/list.h.

To understand how the inode cache works, let us trace the lifetime of an inode of a regular file on an ext2 filesystem as it is opened and closed:

    fd = open("file", O_RDONLY);
    close(fd);

The open(2) system call is implemented in the fs/open.c:sys_open function and the real work is done by the fs/open.c:filp_open() function, which is split into two parts:

1. open_namei() - fills in the nameidata structure containing the dentry and vfsmount structures.
2. dentry_open() - given a dentry and vfsmount, it allocates a new 'struct file' and links them together, as well as invoking the filesystem-specific f_op->open() method, which was set in inode->i_fop when the inode was read in open_namei() (which provided the inode via dentry->d_inode).

The open_namei() function interacts with the dentry cache via path_walk(), which in turn calls real_lookup(), which invokes the inode_operations->lookup() method. This method is filesystem-specific; its job is to find the entry in the parent directory with the matching name and then do iget(sb, ino) to get the corresponding inode, which brings us to the inode cache. When the inode is read in, the dentry is instantiated by means of d_add(dentry, inode). While we are at it, note that for UNIX-style filesystems which have the concept of an on-disk inode number, it is the lookup method's job to map its endianness to the current cpu format, e.g. if the inode number in the raw (fs-specific) dir entry is in little-endian 32 bit format one could do:

    unsigned long ino = le32_to_cpu(de->inode);
    inode = iget(sb, ino);
    d_add(dentry, inode);

So, when we open a file we hit iget(sb, ino), which is really iget4(sb, ino, NULL, NULL), which does:

1. Attempts to find an inode with matching superblock and inode number in the hashtable under protection of inode_lock. If the inode is found then its reference count (i_count) is incremented, and if it was 0 and the inode is not dirty then the inode is removed from whatever type list (inode->i_list) it is currently on (it has to be the inode_unused list, of course) and inserted into the inode_in_use type list, and inodes_stat.nr_unused is decremented.
2. If the inode is currently locked we wait until it is not locked, so that iget4() is guaranteed to return an unlocked inode.
3. If the inode was not found in the hashtable then it is the first time we encounter this inode, so we call get_new_inode(), passing it the pointer to the place in the hashtable where it should be inserted.
4. get_new_inode() allocates a new inode from the inode_cachep SLAB cache bu
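
The "deduct 8" step in the gdb session above is the pointer arithmetic that list_entry() performs: step back from the embedded list link to the start of the structure holding it. The following is a minimal user-space sketch of that idea, not the kernel's code. The names demo_inode, demo_list_entry and inode_from_link are made up for illustration, and the field layout of demo_inode is an assumption that does not match the real struct inode.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical stand-in for struct inode; the real layout differs. */
struct demo_inode {
    unsigned long i_ino;
    struct list_head i_list;   /* link on one of the type lists */
    int i_count;
};

/* User-space equivalent of the list_entry() idea: subtract the offset
 * of the embedded member to reach the start of the enclosing struct. */
#define demo_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Given a pointer to an i_list link, return the owning demo_inode. */
static struct demo_inode *inode_from_link(struct list_head *link)
{
    assert(link != NULL);
    return demo_list_entry(link, struct demo_inode, i_list);
}
```

In the gdb session the offset happened to be 8 on that particular kernel build; in this sketch it is whatever offsetof(struct demo_inode, i_list) evaluates to on the compiling machine.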

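The hit and miss paths of the iget4() lookup described above can be sketched as a toy, single-threaded cache. This is only an illustration under stated assumptions: every name here (toy_inode, toy_cache, toy_iget, toy_iput) is hypothetical, the type lists are reduced to a single on_list tag, there is no locking where the kernel would hold inode_lock, and toy_iput() is added only so that an inode can reach i_count == 0; this part of the text does not describe how references are dropped.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_HASH_SIZE 16

enum toy_list { TOY_UNUSED, TOY_IN_USE };

struct toy_inode {
    const void *i_sb;            /* superblock pointer, used only as a key */
    unsigned long i_ino;
    int i_count;                 /* reference count */
    int i_dirty;                 /* nonzero if the inode is dirty */
    enum toy_list on_list;       /* stands in for i_list membership */
    struct toy_inode *hash_next; /* hash chain, stands in for i_hash */
};

struct toy_cache {
    struct toy_inode *hash[TOY_HASH_SIZE];
    int nr_unused;               /* mirrors inodes_stat.nr_unused */
};

/* Hash on (superblock pointer, inode number); the formula is arbitrary. */
static unsigned toy_hash(const void *sb, unsigned long ino)
{
    return (unsigned)(((uintptr_t)sb / sizeof(void *)) ^ ino) % TOY_HASH_SIZE;
}

static struct toy_inode *toy_iget(struct toy_cache *c, const void *sb,
                                  unsigned long ino)
{
    unsigned h = toy_hash(sb, ino);
    struct toy_inode *inode;

    for (inode = c->hash[h]; inode; inode = inode->hash_next) {
        if (inode->i_sb == sb && inode->i_ino == ino) {
            /* Hit: take a reference; if the inode was idle and clean,
             * move it from the unused list to the in_use list. */
            if (inode->i_count++ == 0 && !inode->i_dirty) {
                inode->on_list = TOY_IN_USE;
                c->nr_unused--;
            }
            return inode;
        }
    }

    /* Miss: allocate a fresh inode and hash it (the get_new_inode() role). */
    inode = calloc(1, sizeof(*inode));
    if (!inode)
        return NULL;
    inode->i_sb = sb;
    inode->i_ino = ino;
    inode->i_count = 1;
    inode->on_list = TOY_IN_USE;
    inode->hash_next = c->hash[h];
    c->hash[h] = inode;
    return inode;
}

/* Assumed counterpart: drop a reference; a clean inode whose count
 * reaches zero moves to the unused list. */
static void toy_iput(struct toy_cache *c, struct toy_inode *inode)
{
    assert(inode->i_count > 0);
    if (--inode->i_count == 0 && !inode->i_dirty) {
        inode->on_list = TOY_UNUSED;
        c->nr_unused++;
    }
}
```

Looking up the same (sb, ino) pair twice returns the same object with i_count raised to 2; after two toy_iput() calls the inode sits on the unused list, and a further toy_iget() revives it and decrements nr_unused again.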