<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v2.6.25.12</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel?h=v2.6.25.12</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel?h=v2.6.25.12'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2008-07-24T16:14:12Z</updated>
<entry>
<title>hrtimer: prevent migration for raising softirq</title>
<updated>2008-07-24T16:14:12Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2008-07-03T18:31:26Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7605f791515eea189e956db44b5a404bd93b29dc'/>
<id>urn:sha1:7605f791515eea189e956db44b5a404bd93b29dc</id>
<content type='text'>
commit ee3ece830f6db9837f7ac67008f532a8c1e755f4 upstream.

Due to a possible deadlock, the waking of the softirq was pushed outside
of the hrtimer base locks. See commit 0c96c5979a522c3323c30a078a70120e29b5bdbc

Unfortunately this allows the task to migrate after setting up the softirq
and raising it. Since softirqs run a queue that is per-cpu we may raise the
softirq on the wrong CPU and this will keep the queued softirq task from
running.

To solve this issue, this patch disables preemption around the releasing
of the hrtimer lock and raising of the softirq.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>sched: fix cpu hotplug</title>
<updated>2008-07-03T03:46:15Z</updated>
<author>
<name>Dmitry Adamushko</name>
<email>dmitry.adamushko@gmail.com</email>
</author>
<published>2008-06-30T16:22:34Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0758c2f30b75419fbe5e0ec6dfc892bbc0687f57'/>
<id>urn:sha1:0758c2f30b75419fbe5e0ec6dfc892bbc0687f57</id>
<content type='text'>
Commit 79c537998d143b127c8c662a403c3356cb885f1c upstream

the CPU hotplug problems (crashes under high-volume unplug+replug
tests) seem to be related to migrate_dead_tasks().

Firstly I added traces to see all tasks being migrated with
migrate_live_tasks() and migrate_dead_tasks(). On my setup the problem
pops up (the one with "se == NULL" in the loop of
pick_next_task_fair()) shortly after the traces indicate that some have
been migrated with migrate_dead_tasks(). Btw, I can reproduce it
much faster now with just a plain cpu down/up loop.

[disclaimer] Well, unless I'm really missing something important at
this late hour [/disclaimer] pick_next_task() is not something
appropriate for migrate_dead_tasks() :-)

the following change seems to eliminate the problem on my setup
(although, I kept it running only for a few minutes to get a few
messages indicating migrate_dead_tasks() does move tasks and the
system is still ok)

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>futexes: fix fault handling in futex_lock_pi</title>
<updated>2008-07-03T03:46:14Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2008-06-23T23:30:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=935cfe235e825998eef8fd0673b1bd62fe4a85c2'/>
<id>urn:sha1:935cfe235e825998eef8fd0673b1bd62fe4a85c2</id>
<content type='text'>
commit 1b7558e457ed0de61023cfc913d2c342c7c3d9f2 upstream

This patch addresses a very sporadic pi-futex related failure in
highly threaded java apps on large SMP systems.

David Holmes reported that the pi_state consistency check in
lookup_pi_state triggered with his test application. This means that
the kernel internal pi_state and the user space futex variable are out
of sync. First we assumed that this is a user space data corruption,
but deeper investigation revealed that the problem happened because the
pi-futex code is not handling a fault in the futex_lock_pi path when
the user space variable needs to be fixed up.

The fault happens when a fork mapped the anon memory which contains
the futex readonly for COW or the page got swapped out exactly between
the unlock of the futex and the return of either the new futex owner
or the task which was the expected owner but failed to acquire the
kernel internal rtmutex. The current futex_lock_pi() code drops out
with an inconsistent state in case it faults and returns -EFAULT to user
space. User space has no way to fix up that state.

When we wrote this code we thought that we could not drop the hash
bucket lock at this point to handle the fault.

After analysing the code again it turned out to be wrong because there
are only two tasks involved which might modify the pi_state and the
user space variable:

 - the task which acquired the rtmutex
 - the pending owner of the pi_state which did not get the rtmutex

Both tasks drop into the fixup_pi_state() function before returning to
user space. The first task which acquired the hash bucket lock faults
in the fixup of the user space variable, drops the spinlock and calls
futex_handle_fault() to fault in the page. Now the second task could
acquire the hash bucket lock and try to fix up the user space
variable as well. It either faults as well or it succeeds because the
first task already faulted the page in.

One caveat is to avoid a double fixup. After returning from the fault
handling we reacquire the hash bucket lock and check whether the
pi_state owner has been modified already.
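
[A minimal user-space sketch of the re-check described above, written as
a Python model and not taken from the patch. PiState, fixup_owner,
write_user and handle_fault are hypothetical names; a threading.Lock
stands in for the hash bucket lock.]

```python
import threading


class PiState:
    """Toy stand-in for the kernel internal pi_state (hypothetical)."""

    def __init__(self, owner):
        self.owner = owner
        self.lock = threading.Lock()  # models the hash bucket lock


def fixup_owner(pi_state, old_owner, new_owner, write_user, handle_fault):
    """Return True if this caller did the fixup, False if it was already done.

    write_user(new_owner) models the user space write and returns False
    when it faults; handle_fault() models faulting the page in. Both are
    assumptions made for this sketch.
    """
    pi_state.lock.acquire()
    try:
        while True:
            # Re-check after every lock (re)acquisition: the other task
            # may have completed the fixup while the lock was dropped.
            if pi_state.owner != old_owner:
                return False
            if write_user(new_owner):
                pi_state.owner = new_owner
                return True
            # The write faulted: drop the lock, fault the page in,
            # retake the lock and go around again.
            pi_state.lock.release()
            try:
                handle_fault()
            finally:
                pi_state.lock.acquire()
    finally:
        pi_state.lock.release()
```

The owner comparison sits at the top of the loop so that it runs again
each time the lock is retaken, which is what prevents the double fixup.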

Reported-by: David Holmes &lt;david.holmes@sun.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: David Holmes &lt;david.holmes@sun.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>capabilities: remain source compatible with 32-bit raw legacy capability support.</title>
<updated>2008-06-09T18:27:06Z</updated>
<author>
<name>Andrew G. Morgan</name>
<email>morgan@kernel.org</email>
</author>
<published>2008-06-06T18:44:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=52491a6bd397c868a0e8b0282548e8527f26e705'/>
<id>urn:sha1:52491a6bd397c868a0e8b0282548e8527f26e705</id>
<content type='text'>
upstream commit: ca05a99a54db1db5bca72eccb5866d2a86f8517f

Source code out there hard-codes a notion of what the
_LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the
raw capability system calls capget() and capset().  It's unfortunate, but
true.

Since the confusing header file has been in a released kernel, there is
software that is erroneously using 64-bit capabilities with the semantics
of 32-bit capabilities.  These recently compiled programs may suffer
corruption of their memory when sys_capget() overwrites more memory than
they are coded to expect, and the raising of added capabilities when using
sys_capset().

As such, this patch does a number of things to clean up the situation
for all. It

  1. forces the _LINUX_CAPABILITY_VERSION define to always retain its
     legacy value.

  2. adopts a new #define strategy for the kernel's internal
     implementation of the preferred magic.

  3. deprecates v2 capability magic in favor of a new (v3) magic
     number. The functionality of v3 is entirely equivalent to v2,
     the only difference being that the v2 magic causes the kernel
     to log a "deprecated" warning so the admin can find applications
     that may be using v2 inappropriately.
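
[A minimal sketch of the three points above, written as a user-space
Python model and not taken from the patch. The magic values are the ones
published in linux/capability.h; cap_words_for_magic is a hypothetical
helper, and only the v2 "deprecated" warning named above is modeled.]

```python
# Header magics as published in linux/capability.h.
LINUX_CAPABILITY_VERSION_1 = 0x19980330  # legacy: one 32-bit word per set
LINUX_CAPABILITY_VERSION_2 = 0x20071026  # deprecated: two 32-bit words
LINUX_CAPABILITY_VERSION_3 = 0x20080522  # preferred: two 32-bit words

# Point 1: the unversioned define always keeps its legacy value.
LINUX_CAPABILITY_VERSION = LINUX_CAPABILITY_VERSION_1


def cap_words_for_magic(magic):
    """Return (words, deprecated) for a header magic (hypothetical helper).

    words is the number of 32-bit words per capability set, or None for
    an unrecognised magic. deprecated is True only for the v2 magic.
    """
    if magic == LINUX_CAPABILITY_VERSION_1:
        return 1, False
    if magic == LINUX_CAPABILITY_VERSION_2:
        # Point 3: same layout as v3, but flagged so the admin can
        # find the application.
        return 2, True
    if magic == LINUX_CAPABILITY_VERSION_3:
        return 2, False
    return None, False
```

A program built against the legacy define therefore keeps getting the
one-word layout it was coded for, while v2 and v3 callers share the
two-word layout.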

[User space code continues to be encouraged to use the libcap API which
protects the application from details like this.  libcap-2.10 is the first
to support v3 capabilities.]

Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518.
Thanks to Bojan Smojver for the report.

[akpm@linux-foundation.org: s/depreciate/deprecate/g]
[akpm@linux-foundation.org: be robust about put_user size]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Cc: Bojan Smojver &lt;bojan@rexursive.com&gt;
Cc: stable@kernel.org
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>cgroups: remove node_ prefix from ns subsystem</title>
<updated>2008-06-09T18:27:01Z</updated>
<author>
<name>Cedric Le Goater</name>
<email>clg@fr.ibm.com</email>
</author>
<published>2008-05-24T17:40:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d7f8570c79107646445ffe2eba5e556c2c89c4d8'/>
<id>urn:sha1:d7f8570c79107646445ffe2eba5e556c2c89c4d8</id>
<content type='text'>
upstream commit: 5c02b575780d0d785815a1e7b79a98edddee895a

This is a slight change in the namespace cgroup subsystem api.

The change is that previously when cgroup_clone() was called (currently
only from the unshare path in ns_proxy cgroup) you'd get a new group named
"node_$pid", whereas now you'll get a group named after just your pid.

The only users who would notice it are those who are using the ns_proxy
cgroup subsystem to auto-create cgroups when namespaces are unshared -
something of an experimental feature, which I think really needs more
complete container/namespace support in order to be useful.  I suspect the
only users are Cedric and Serge, or maybe a few others on
containers@lists.linux-foundation.org.  And in fact it would only be
noticed by the users who make the assumption about how the name is
generated, rather than getting it from the /proc/&lt;pid&gt;/cgroups file for
the process in question.

Whether the change is actually needed or not I'm fairly agnostic on, but I
guess it is more elegant to just use the pid as the new group name rather
than adding a fairly arbitrary "node_" prefix on the front.

[menage@google.com: provided changelog]
Signed-off-by: Cedric Le Goater &lt;clg@fr.ibm.com&gt;
Cc: "Paul Menage" &lt;menage@google.com&gt;
Cc: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>sched: fix hrtick_start_fair and CPU-Hotplug</title>
<updated>2008-05-10T04:40:44Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2008-05-06T03:05:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=980e8ec0c4ab54164725f1a71545c439a755e918'/>
<id>urn:sha1:980e8ec0c4ab54164725f1a71545c439a755e918</id>
<content type='text'>
commit: b328ca182f01c2a04b85e0ee8a410720b104fbcc upstream

Gautham R Shenoy reported:

 &gt; While running the usual CPU-Hotplug stress tests on linux-2.6.25,
 &gt; I noticed the following in the console logs.
 &gt;
 &gt; This is a wee bit difficult to reproduce. In the past 10 runs I hit this
 &gt; only once.
 &gt;
 &gt; ------------[ cut here ]------------
 &gt;
 &gt; WARNING: at kernel/sched.c:962 hrtick+0x2e/0x65()
 &gt;
 &gt; Just wondering if we are doing a good job at handling the cancellation
 &gt; of any per-cpu scheduler timers during CPU-Hotplug.

This looks like it's indeed not cancelled at all and it is migrated to
another cpu. Fix it via a proper hotplug notifier mechanism.

Reported-by: Gautham R Shenoy &lt;ego@in.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>hrtimer: raise softirq unlocked to avoid circular lock dependency</title>
<updated>2008-05-01T21:44:39Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2008-04-29T01:15:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f5f5e084959d9c22c43c235b206b2e2fe2971e7f'/>
<id>urn:sha1:f5f5e084959d9c22c43c235b206b2e2fe2971e7f</id>
<content type='text'>
commit 0c96c5979a522c3323c30a078a70120e29b5bdbc upstream

The scheduler hrtimer bits in 2.6.25 introduced a circular lock
dependency in a rare code path:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-sched-devel.git-x86-latest.git #19
-------------------------------------------------------
X/2980 is trying to acquire lock:
 (&amp;rq-&gt;rq_lock_key#2){++..}, at: [&lt;ffffffff80230146&gt;] task_rq_lock+0x56/0xa0

but task is already holding lock:
 (&amp;cpu_base-&gt;lock){++..}, at: [&lt;ffffffff80257ae1&gt;] lock_hrtimer_base+0x31/0x60

which lock already depends on the new lock.

The scenario which leads to this is:

posix-timer signal is delivered
 -&gt; posix-timer is rearmed
    timer is already expired in hrtimer_enqueue()
     -&gt; softirq is raised

To prevent this we need to move the raise of the softirq out of the
base-&gt;lock protected code path.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>hrtimer: timeout too long when using HRTIMER_CB_SOFTIRQ</title>
<updated>2008-05-01T21:44:38Z</updated>
<author>
<name>Bodo Stroesser</name>
<email>bstroesser@fujitsu-siemens.com</email>
</author>
<published>2008-04-28T17:15:50Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=59c5775ada913643998fd78d8a5b1a76ba57515f'/>
<id>urn:sha1:59c5775ada913643998fd78d8a5b1a76ba57515f</id>
<content type='text'>
commit d7b41a24bfb5d7fa02f7b49be1293d468814e424 upstream

When using hrtimer with timer-&gt;cb_mode == HRTIMER_CB_SOFTIRQ
in some cases the clockevent is not programmed.
This happens if:
 - a timer is rearmed while its state is HRTIMER_STATE_CALLBACK
 - hrtimer_reprogram() returns -ETIME, when it is called after
   CALLBACK is finished. This occurs if the new timer-&gt;expires
   is in the past when CALLBACK is done.
In this case, the timer needs to be removed from the tree and put
onto the pending list again.
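
[A minimal sketch of the case described above, written as a user-space
Python model and not taken from the patch. finish_callback, the list
based tree and pending containers, and the reprogram callable are
hypothetical; ETIME is hard-coded to its Linux value.]

```python
ETIME = 62  # Linux errno value for "Timer expired"


def finish_callback(timer, tree, pending, reprogram):
    """Model the check done once the callback of a timer has finished.

    tree holds timers that were (re)armed, pending holds timers waiting
    for the softirq. reprogram(timer) models hrtimer_reprogram() and
    returns -ETIME when the new expiry is already in the past.
    Returns True if the timer was moved to the pending list.
    """
    if timer not in tree:
        # The callback did not rearm the timer, nothing to do.
        return False
    if reprogram(timer) == -ETIME:
        # The clockevent was not programmed for this timer, so it would
        # sit in the tree unnoticed: move it back to the pending list.
        tree.remove(timer)
        pending.append(timer)
        return True
    return False
```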

The patch is against 2.6.22.5, but AFAICS, it is relevant
for 2.6.25 also (in run_hrtimer_pending()).

Signed-off-by: Bodo Stroesser &lt;bstroesser@fujitsu-siemens.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>cgroup: fix a race condition in manipulating tsk-&gt;cg_list</title>
<updated>2008-05-01T21:44:33Z</updated>
<author>
<name>Li Zefan</name>
<email>lizf@cn.fujitsu.com</email>
</author>
<published>2008-04-18T16:25:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bc657c218dc4f0d8fbb5fb9c746c0dd9736e128a'/>
<id>urn:sha1:bc657c218dc4f0d8fbb5fb9c746c0dd9736e128a</id>
<content type='text'>
commit: 0e04388f0189fa1f6812a8e1cb6172136eada87e

When I ran a test program to fork mass processes and at the same time
'cat /cgroup/tasks', I got the following oops:

  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:72!
  invalid opcode: 0000 [#1] SMP
  Pid: 4178, comm: a.out Not tainted (2.6.25-rc9 #72)
  ...
  Call Trace:
   [&lt;c044a5f9&gt;] ? cgroup_exit+0x55/0x94
   [&lt;c0427acf&gt;] ? do_exit+0x217/0x5ba
   [&lt;c0427ed7&gt;] ? do_group_exit+0x65/0x7c
   [&lt;c0427efd&gt;] ? sys_exit_group+0xf/0x11
   [&lt;c0404842&gt;] ? syscall_call+0x7/0xb
   [&lt;c05e0000&gt;] ? init_cyrix+0x2fa/0x479
  ...
  EIP: [&lt;c04df671&gt;] list_del+0x35/0x53 SS:ESP 0068:ebc7df4
  ---[ end trace caffb7332252612b ]---
  Fixing recursive fault but reboot is needed!

After digging into the code and debugging, I finally found out a race
situation:

				do_exit()
				  -&gt;cgroup_exit()
				    -&gt;if (!list_empty(&amp;tsk-&gt;cg_list))
				        list_del(&amp;tsk-&gt;cg_list);

  cgroup_iter_start()
    -&gt;cgroup_enable_task_cg_list()
      -&gt;list_add(&amp;tsk-&gt;cg_list, ..);

In this case the list won't be deleted though the process has exited.

We got two bug reports in the past, which seem to be the same bug as
this one:
	http://lkml.org/lkml/2008/3/5/332
	http://lkml.org/lkml/2007/10/17/224

Actually sometimes I got oops on list_del, sometimes oops on list_add.
And I can change my test program a bit to trigger other oops.

The patch has been tested both on x86_32 and x86_64.

Signed-off-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Acked-by: Paul Menage &lt;menage@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Fix locking bug in "acquire_console_semaphore_for_printk()"</title>
<updated>2008-04-15T20:09:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-04-15T20:09:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=093a07e2fdfaddab7fc7d4adc76cc569c86603d7'/>
<id>urn:sha1:093a07e2fdfaddab7fc7d4adc76cc569c86603d7</id>
<content type='text'>
When I cleaned up printk() and split up the printk locking logic in
commit 266c2e0abeca649fa6667a1a427ad1da507c6375 ("Make printk() console
semaphore accesses sensible") I had incorrectly moved the call to
have_callable_console() outside of the console semaphore.

That was buggy.  The console semaphore protects the console_drivers list
that is used by have_callable_console().

Thanks go to Bongani Hlope who saw this as a hang on shutdown and reboot
and bisected the bug to the right commit, and tested this patch. See

	http://lkml.org/lkml/2008/4/11/315

Bisected-and-tested-by: Bongani Hlope &lt;bonganilinux@mweb.co.za&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
