<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<book id="lk-hacking-guide">
<bookinfo>
<title>Unreliable Guide To Hacking The Linux Kernel</title>
<authorgroup>
<author>
<firstname>Rusty</firstname>
<surname>Russell</surname>
<affiliation>
<address>
<email>rusty@rustcorp.com.au</email>
</address>
</affiliation>
</author>
</authorgroup>
<copyright>
<year>2005</year>
<holder>Rusty Russell</holder>
</copyright>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later
version.
</para>
<para>
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
<releaseinfo>
This is the first release of this document as part of the kernel tarball.
</releaseinfo>
</bookinfo>
<toc></toc>
<chapter id="introduction">
<title>Introduction</title>
<para>
Welcome, gentle reader, to Rusty's Remarkably Unreliable Guide to Linux
Kernel Hacking. This document describes the common routines and
general requirements for kernel code: its goal is to serve as a
primer for Linux kernel development for experienced C
programmers. I avoid implementation details: that's what the
code is for, and I ignore whole tracts of useful routines.
</para>
<para>
Before you read this, please understand that I never wanted to
write this document, being grossly under-qualified, but I always
wanted to read it, and this was the only way. I hope it will
grow into a compendium of best practice, common starting points
and random information.
</para>
</chapter>
<chapter id="basic-players">
<title>The Players</title>
<para>
At any time each of the CPUs in a system can be:
</para>
<itemizedlist>
<listitem>
<para>
not associated with any process, serving a hardware interrupt;
</para>
</listitem>
<listitem>
<para>
not associated with any process, serving a softirq or tasklet;
</para>
</listitem>
<listitem>
<para>
running in kernel space, associated with a process (user context);
</para>
</listitem>
<listitem>
<para>
running a process in user space.
</para>
</listitem>
</itemizedlist>
<para>
There is an ordering between these. The bottom two can preempt
each other, but above that is a strict hierarchy: each can only be
preempted by the ones above it. For example, while a softirq is
running on a CPU, no other softirq will preempt it, but a hardware
interrupt can. However, any other CPUs in the system execute
independently.
</para>
<para>
We'll see a number of ways that the user context can block
interrupts, to become truly non-preemptable.
</para>
<sect1 id="basics-usercontext">
<title>User Context</title>
<para>
User context is when you are coming in from a system call or other
trap: like userspace, you can be preempted by more important tasks
and by interrupts. You can sleep, by calling
<function>schedule()</function>.
</para>
<note>
<para>
You are always in user context on module load and unload,
and on operations on the block device layer.
</para>
</note>
<para>
In user context, the <varname>current</varname> pointer (indicating
the task we are currently executing) is valid, and
<function>in_interrupt()</function>
(<filename>include/linux/interrupt.h</filename>) is
<returnvalue>false</returnvalue>.
</para>
<caution>
<para>
Beware that if you have preemption or softirqs disabled
(see below), <function>in_interrupt()</function> will return a
false positive.
</para>
</caution>
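<para>
As a rough illustration of the checks described above, here is a
minimal sketch of a helper that reports the context it was called
from. The name <function>report_context()</function> is made up
for this example; only <function>in_interrupt()</function>,
<varname>current</varname> and <function>printk()</function> are
real kernel interfaces.
</para>
<programlisting><![CDATA[
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/interrupt.h>

/* Hypothetical helper, for illustration only. */
static void report_context(void)
{
	if (in_interrupt()) {
		/*
		 * Hard or soft interrupt context, or a false positive
		 * as described in the caution above.  Either way, we
		 * must not sleep, and current means nothing useful.
		 */
		printk(KERN_INFO "report_context: interrupt context\n");
		return;
	}

	/* User context: current is valid, and sleeping is allowed. */
	printk(KERN_INFO "report_context: user context, task %s (pid %d)\n",
	       current->comm, current->pid);
}
]]></programlisting>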
</sect1>
<sect1 id="basics-hardirqs">
<title>Hardware Interrupts (Hard IRQs)</title>
<para>
Timer ticks, <hardware>network cards</hardware> and
<hardware>keyboard</hardware> are examples of real
hardware which produce interrupts at any time. The kernel runs
an interrupt handler, which services the hardware. The kernel
guarantees that this handler is never re-entered: if the same
interrupt arrives while it is running, it is queued (or dropped). Because it
disables interrupts, this handler has to be fast: frequently it
simply acknowledges the interrupt, marks a 'software interrupt'
for execution and exits.
</para>
<para>
You can tell you are in a hardware interrupt, because
<function>in_irq()</function> returns <returnvalue>true</returnvalue>.
</para>
<caution>
<para>
Beware that <function>in_irq()</function> will return a false positive
if interrupts are disabled (see below).
</para>
</caution>
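<para>
The "acknowledge, mark a software interrupt, exit" pattern might
look something like the sketch below. The device and every
<function>mydev_*</function> name are hypothetical, and the
two-argument handler signature is an assumption: it is the form
used since 2.6.19, while older kernels pass a third
<type>struct pt_regs *</type> argument. Treat this as an outline,
not as a working driver.
</para>
<programlisting><![CDATA[
#include <linux/interrupt.h>

/* Hypothetical: tell the hardware we have seen its interrupt. */
static void mydev_ack(void *dev)
{
	/* A device-specific register write would go here. */
}

/* The slow part of the work: runs later, in softirq context. */
static void mydev_do_work(unsigned long data)
{
	/* Real servicing of the device would go here. */
}

static struct tasklet_struct mydev_tasklet;

/* Hard IRQ handler: do the minimum and get out. */
static irqreturn_t mydev_interrupt(int irq, void *dev_id)
{
	mydev_ack(dev_id);
	/* Mark the 'software interrupt' for execution. */
	tasklet_schedule(&mydev_tasklet);
	return IRQ_HANDLED;
}

/*
 * During setup, in user context (irq and dev are assumed to come
 * from the driver's own probing code):
 *
 *	tasklet_init(&mydev_tasklet, mydev_do_work, 0);
 *	request_irq(irq, mydev_interrupt, 0, "mydev", dev);
 */
]]></programlisting>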
</sect1>
<sect1 id="basics-softirqs">
<title>Software Interrupt Context: Softirqs and Tasklets</title>
<para>
Whenever a system call is about to return to userspace, or a
hardware interrupt handler exits, any 'software interrupts'
which are marked pending (usually by hardware interrupts) are
run (<filename>kernel/softirq.c</filename>).
</para>
<para>
Much of the real interrupt handling work is done here. Early in<