📄 kernel-locking.tmpl
reads are far more common than writes. If not, there is another
approach you can use to reduce the time the lock is held: reference
counts.
    </para>

    <para>
      In this approach, an object has an owner, who sets the
      reference count to one. Whenever you get a pointer to the
      object, you increment the reference count (a `get' operation).
      Whenever you relinquish a pointer, you decrement the reference
      count (a `put' operation). When the owner wants to destroy
      it, they mark it dead, and do a put.
    </para>

    <para>
      Whoever drops the reference count to zero (usually implemented
      with <function>atomic_dec_and_test()</function>) actually cleans
      up and frees the object.
    </para>

    <para>
      This means that you are guaranteed that the object won't
      vanish underneath you, even though you no longer have a lock
      for the collection.
    </para>

    <para>
      Here's some skeleton code:
    </para>

    <programlisting>
void create_foo(struct foo *x)
{
        /* The owner holds the first reference. */
        atomic_set(&x->use, 1);
        spin_lock_bh(&list_lock);
        /* ... insert in list ... */
        spin_unlock_bh(&list_lock);
}

struct foo *get_foo(int desc)
{
        struct foo *ret = NULL;

        spin_lock_bh(&list_lock);
        /* ... find in list ... */
        if (ret)
                atomic_inc(&ret->use);
        spin_unlock_bh(&list_lock);

        return ret;
}

void put_foo(struct foo *x)
{
        /* Whoever drops the last reference frees the object. */
        if (atomic_dec_and_test(&x->use))
                kfree(x);
}

void destroy_foo(struct foo *x)
{
        spin_lock_bh(&list_lock);
        /* ... remove from list ... */
        spin_unlock_bh(&list_lock);

        /* Drop the owner's reference. */
        put_foo(x);
}
    </programlisting>

    <sect2 id="helpful-macros">
      <title>Macros To Help You</title>
      <para>
        There is a set of debugging macros tucked inside
        <filename class=headerfile>include/linux/netfilter_ipv4/lockhelp.h</filename>
        and <filename class=headerfile>listhelp.h</filename>: these are very
        useful for ensuring that locks are held in the right places to protect
        infrastructure.
      </para>
    </sect2>
  </sect1>

  <sect1 id="sleeping-things">
    <title>Things Which Sleep</title>

    <para>
      You can never call the following routines while holding a
      spinlock, as they may sleep. This also means you need to be in
      user context.
    </para>

    <itemizedlist>
      <listitem>
        <para>
          Accesses to
          <firstterm linkend="gloss-userspace">userspace</firstterm>:
        </para>
        <itemizedlist>
          <listitem>
            <para>
              <function>copy_from_user()</function>
            </para>
          </listitem>
          <listitem>
            <para>
              <function>copy_to_user()</function>
            </para>
          </listitem>
          <listitem>
            <para>
              <function>get_user()</function>
            </para>
          </listitem>
          <listitem>
            <para>
              <function>put_user()</function>
            </para>
          </listitem>
        </itemizedlist>
      </listitem>

      <listitem>
        <para>
          <function>kmalloc(GFP_KERNEL)</function>
        </para>
      </listitem>

      <listitem>
        <para>
          <function>down_interruptible()</function> and
          <function>down()</function>
        </para>
        <para>
          There is a <function>down_trylock()</function> which can be
          used inside interrupt context, as it will not sleep.
          <function>up()</function> will also never sleep.
        </para>
      </listitem>
    </itemizedlist>

    <para>
      <function>printk()</function> can be called in
      <emphasis>any</emphasis> context, interestingly enough.
    </para>
  </sect1>

  <sect1 id="sparc">
    <title>The Fucked Up Sparc</title>

    <para>
      Alan Cox says <quote>the irq disable/enable is in the register
      window on a sparc</quote>. Andi Kleen says <quote>when you do
      restore_flags in a different function you mess up all the
      register windows</quote>.
    </para>

    <para>
      So never pass the flags word set by
      <function>spin_lock_irqsave()</function> and brethren to another
      function (unless it's declared <type>inline</type>). Usually no-one
      does this, but now you've been warned. Dave Miller can never do
      anything in a straightforward manner (I can say that, because I have
      pictures of him and a certain PowerPC maintainer in a compromising
      position).
    </para>
  </sect1>

  <sect1 id="racing-timers">
    <title>Racing Timers: A Kernel Pastime</title>

    <para>
      Timers can produce their own special problems with races.
      Consider a collection of objects (list, hash, etc) where each
      object has a timer which is due to destroy it.
    </para>

    <para>
      If you want to destroy the entire collection (say on module
      removal), you might do the following:
    </para>

    <programlisting>
/* THIS CODE BAD BAD BAD BAD: IF IT WAS ANY WORSE IT WOULD USE
   HUNGARIAN NOTATION */
spin_lock_bh(&list_lock);

while (list) {
        struct foo *next = list->next;
        del_timer(&list->timer);
        kfree(list);
        list = next;
}

spin_unlock_bh(&list_lock);
    </programlisting>

    <para>
      Sooner or later, this will crash on SMP, because a timer may
      have just gone off before the <function>spin_lock_bh()</function>.
      Its handler will only get the lock after we
      <function>spin_unlock_bh()</function>, and will then try to free
      the element (which has already been freed!).
    </para>

    <para>
      This can be avoided by checking the result of
      <function>del_timer()</function>: if it returns
      <returnvalue>1</returnvalue>, the timer has been deleted.
      If <returnvalue>0</returnvalue>, it means (in this
      case) that it is currently running, so we can do:
    </para>

    <programlisting>
retry:
        spin_lock_bh(&list_lock);

        while (list) {
                struct foo *next = list->next;
                if (!del_timer(&list->timer)) {
                        /* Give timer a chance to delete this */
                        spin_unlock_bh(&list_lock);
                        goto retry;
                }
                kfree(list);
                list = next;
        }

        spin_unlock_bh(&list_lock);
    </programlisting>

    <para>
      Another common problem is deleting timers which restart
      themselves (by calling <function>add_timer()</function> at the end
      of their timer function). Because this is a fairly common case
      which is prone to races, you can put a call to
      <function>timer_exit()</function> at the very end of your timer
      function, and use <function>del_timer_sync()</function>
      (<filename class=headerfile>include/linux/timer.h</filename>)
      to handle this case. It returns the number of times the timer
      had to be deleted before we finally stopped it from adding itself
      back in.
    </para>
  </sect1>
</chapter>

<chapter id="references">
  <title>Further reading</title>

  <itemizedlist>
    <listitem>
      <para>
        <filename>Documentation/spinlocks.txt</filename>:
        Linus Torvalds' spinlocking tutorial in the kernel sources.
      </para>
    </listitem>

    <listitem>
      <para>
        Unix Systems for Modern Architectures: Symmetric
        Multiprocessing and Caching for Kernel Programmers:
      </para>

      <para>
        Curt Schimmel's very good introduction to kernel level
        locking (not written for Linux, but nearly everything
        applies). The book is expensive, but really worth every
        penny to understand SMP locking. [ISBN: 0201633388]
      </para>
    </listitem>
  </itemizedlist>
</chapter>

<chapter id="thanks">
  <title>Thanks</title>

  <para>
    Thanks to Telsa Gwynne for DocBooking, neatening and adding
    style.
  </para>

  <para>
    Thanks to Martin Pool, Philipp Rumpf, Stephen Rothwell, Paul
    Mackerras, Ruedi Aschwanden, Alan Cox, Manfred Spraul and Tim
    Waugh for proofreading, correcting, flaming, commenting.
  </para>

  <para>
    Thanks to the cabal for having no influence on this document.
  </para>
</chapter>

<glossary id="glossary">
  <title>Glossary</title>

  <glossentry id="gloss-bh">
    <glossterm>bh</glossterm>
    <glossdef>
      <para>
        Bottom Half: for historical reasons, functions with
        `_bh' in them often now refer to any software interrupt, e.g.
        <function>spin_lock_bh()</function> blocks any software interrupt
        on the current CPU. Bottom halves are deprecated, and will
        eventually be replaced by tasklets. Only one bottom half will be
        running at any time.
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-hwinterrupt">
    <glossterm>Hardware Interrupt / Hardware IRQ</glossterm>
    <glossdef>
      <para>
        Hardware interrupt request. <function>in_irq()</function> returns
        <returnvalue>true</returnvalue> in a hardware interrupt handler (it
        also returns true when interrupts are blocked).
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-interruptcontext">
    <glossterm>Interrupt Context</glossterm>
    <glossdef>
      <para>
        Not user context: processing a hardware irq or software irq.
        Indicated by the <function>in_interrupt()</function> macro
        returning <returnvalue>true</returnvalue> (although it also
        returns true when interrupts or BHs are blocked).
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-smp">
    <glossterm><acronym>SMP</acronym></glossterm>
    <glossdef>
      <para>
        Symmetric Multi-Processor: kernels compiled for multiple-CPU
        machines. (CONFIG_SMP=y).
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-softirq">
    <glossterm>softirq</glossterm>
    <glossdef>
      <para>
        Strictly speaking, one of up to 32 enumerated software
        interrupts which can run on multiple CPUs at once.
        Sometimes used to refer to tasklets and bottom halves as
        well (i.e. all software interrupts).
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-swinterrupt">
    <glossterm>Software Interrupt / Software IRQ</glossterm>
    <glossdef>
      <para>
        Software interrupt handler. <function>in_irq()</function> returns
        <returnvalue>false</returnvalue>; <function>in_softirq()</function>
        returns <returnvalue>true</returnvalue>. Tasklets, softirqs and
        bottom halves all fall into the category of `software interrupts'.
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-tasklet">
    <glossterm>tasklet</glossterm>
    <glossdef>
      <para>
        A dynamically-registrable software interrupt,
        which is guaranteed to only run on one CPU at a time.
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-up">
    <glossterm><acronym>UP</acronym></glossterm>
    <glossdef>
      <para>
        Uni-Processor: Non-SMP. (CONFIG_SMP=n).
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-usercontext">
    <glossterm>User Context</glossterm>
    <glossdef>
      <para>
        The kernel executing on behalf of a particular
        process or kernel thread (given by the <function>current</function>
        macro). Not to be confused with userspace.
        Can be interrupted by software or hardware interrupts.
      </para>
    </glossdef>
  </glossentry>

  <glossentry id="gloss-userspace">
    <glossterm>Userspace</glossterm>
    <glossdef>
      <para>
        A process executing its own code outside the kernel.
      </para>
    </glossdef>
  </glossentry>
</glossary>
</book>