<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v3.4.24</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel?h=v3.4.24</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel?h=v3.4.24'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-12-17T18:37:46Z</updated>
<entry>
<title>rcu: Fix batch-limit size problem</title>
<updated>2012-12-17T18:37:46Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-10-18T11:55:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=29251de53e1712158c29b3626a008d8b93a6e024'/>
<id>urn:sha1:29251de53e1712158c29b3626a008d8b93a6e024</id>
<content type='text'>
commit 878d7439d0f45a95869e417576774673d1fa243f upstream.

Commit 29c00b4a1d9e27 (rcu: Add event-tracing for RCU callback
invocation) added a regression in rcu_do_batch()

Under stress, RCU is supposed to allow to process all items in queue,
instead of a batch of 10 items (blimit), but an integer overflow makes
the effective limit being 1.  So, unless there is frequent idle periods
(during which RCU ignores batch limits), RCU can be forced into a
state where it cannot keep up with the callback-generation rate,
eventually resulting in OOM.

This commit therefore converts a few variables in rcu_do_batch() from
int to long to fix this problem, along with the module parameters
controlling the batch limits.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>ftrace: Clear bits properly in reset_iter_read()</title>
<updated>2012-12-17T18:37:46Z</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2012-06-09T16:10:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=455d8953951baf0d38673b010100241e508b765c'/>
<id>urn:sha1:455d8953951baf0d38673b010100241e508b765c</id>
<content type='text'>
commit 70f77b3f7ec010ff9624c1f2e39a81babc9e2429 upstream.

There is a typo here where '&amp;' is used instead of '|' and it turns the
statement into a noop.  The original code is equivalent to:

	iter-&gt;flags &amp;= ~((1 &lt;&lt; 2) &amp; (1 &lt;&lt; 4));

Link: http://lkml.kernel.org/r/20120609161027.GD6488@elgon.mountain

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s</title>
<updated>2012-12-17T18:37:43Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-12-04T15:40:39Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4afca924c03e95345be979bfb7abbfa85963fada'/>
<id>urn:sha1:4afca924c03e95345be979bfb7abbfa85963fada</id>
<content type='text'>
commit fc4b514f2727f74a4587c31db87e0e93465518c3 upstream.

8852aac25e ("workqueue: mod_delayed_work_on() shouldn't queue timer on
0 delay") unexpectedly uncovered a very nasty abuse of delayed_work in
megaraid - it allocated work_struct, casted it to delayed_work and
then pass that into queue_delayed_work().

Previously, this was okay because 0 @delay short-circuited to
queue_work() before doing anything with delayed_work.  8852aac25e
moved 0 @delay test into __queue_delayed_work() after sanity check on
delayed_work making megaraid trigger BUG_ON().

Although megaraid is already fixed by c1d390d8e6 ("megaraid: fix
BUG_ON() from incorrect use of delayed work"), this patch converts
BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s so that such
abusers, if there are more, trigger warning but don't crash the
machine.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Xiaotian Feng &lt;xtfeng@gmail.com&gt;
Signed-off-by: Shuah Khan &lt;shuah.khan@hp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Revert "sched, autogroup: Stop going ahead if autogroup is disabled"</title>
<updated>2012-12-10T18:59:40Z</updated>
<author>
<name>Mike Galbraith</name>
<email>efault@gmx.de</email>
</author>
<published>2012-12-03T05:25:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3d81da8645585b9c116a7ac20b99ac8909677c80'/>
<id>urn:sha1:3d81da8645585b9c116a7ac20b99ac8909677c80</id>
<content type='text'>
commit fd8ef11730f1d03d5d6555aa53126e9e34f52f12 upstream.

This reverts commit 800d4d30c8f20bd728e5741a3b77c4859a613f7c.

Between commits 8323f26ce342 ("sched: Fix race in task_group()") and
800d4d30c8f2 ("sched, autogroup: Stop going ahead if autogroup is
disabled"), autogroup is a wreck.

With both applied, all you have to do to crash a box is disable
autogroup during boot up, then reboot..  boom, NULL pointer dereference
due to commit 800d4d30c8f2 not allowing autogroup to move things, and
commit 8323f26ce342 making that the only way to switch runqueues:

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [&lt;ffffffff81063ac0&gt;] effective_load.isra.43+0x50/0x90
  Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 MEDIONPC MS-7502/MS-7502
  RIP: effective_load.isra.43+0x50/0x90
  Process systemd-user-se (pid: 7047, threadinfo ffff880221dde000, task ffff88022618b3a0)
  Call Trace:
    select_task_rq_fair+0x255/0x780
    try_to_wake_up+0x156/0x2c0
    wake_up_state+0xb/0x10
    signal_wake_up+0x28/0x40
    complete_signal+0x1d6/0x250
    __send_signal+0x170/0x310
    send_signal+0x40/0x80
    do_send_sig_info+0x47/0x90
    group_send_sig_info+0x4a/0x70
    kill_pid_info+0x3a/0x60
    sys_kill+0x97/0x1a0
    ? vfs_read+0x120/0x160
    ? sys_read+0x45/0x90
    system_call_fastpath+0x16/0x1b
  Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 &lt;48&gt; 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c
  RIP  [&lt;ffffffff81063ac0&gt;] effective_load.isra.43+0x50/0x90
   RSP &lt;ffff880221ddfbd8&gt;
  CR2: 0000000000000000

Signed-off-by: Mike Galbraith &lt;efault@gmx.de&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Yong Zhang &lt;yong.zhang0@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>workqueue: exit rescuer_thread() as TASK_RUNNING</title>
<updated>2012-12-10T18:59:39Z</updated>
<author>
<name>Mike Galbraith</name>
<email>mgalbraith@suse.de</email>
</author>
<published>2012-11-28T06:17:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=dbdd7f0c98e8ee0d49da5e8b462ad2ba07d0f358'/>
<id>urn:sha1:dbdd7f0c98e8ee0d49da5e8b462ad2ba07d0f358</id>
<content type='text'>
commit 412d32e6c98527078779e5b515823b2810e40324 upstream.

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith &lt;mgalbraith@suse.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timekeeping: Cast raw_interval to u64 to avoid shift overflow</title>
<updated>2012-12-03T19:47:23Z</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2012-10-09T07:18:23Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1103ef8e70a1725b7fd9e0cb18a8b1edb6e5c42d'/>
<id>urn:sha1:1103ef8e70a1725b7fd9e0cb18a8b1edb6e5c42d</id>
<content type='text'>
commit 5b3900cd409466c0070b234d941650685ad0c791 upstream.

We fixed a bunch of integer overflows in timekeeping code during the 3.6
cycle.  I did an audit based on that and found this potential overflow.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Acked-by: John Stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/20121009071823.GA19159@elgon.mountain
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[ herton: adapt for 3.5, timekeeper instead of tk pointer ]
Signed-off-by: Herton Ronaldo Krzesinski &lt;herton.krzesinski@canonical.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>watchdog: using u64 in get_sample_period()</title>
<updated>2012-12-03T19:47:17Z</updated>
<author>
<name>Chuansheng Liu</name>
<email>chuansheng.liu@intel.com</email>
</author>
<published>2012-11-27T00:29:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=53b360c604575070c83d2da83636af4e5c5de35a'/>
<id>urn:sha1:53b360c604575070c83d2da83636af4e5c5de35a</id>
<content type='text'>
commit 8ffeb9b0e6369135bf03a073514f571ef10606b9 upstream.

In get_sample_period(), unsigned long is not enough:

  watchdog_thresh * 2 * (NSEC_PER_SEC / 5)

case1:
  watchdog_thresh is 10 by default, the sample value will be: 0xEE6B2800

case2:
 set watchdog_thresh is 20, the sample value will be: 0x1 DCD6 5000

In case2, we need use u64 to express the sample period.  Otherwise,
changing the threshold thru proc often can not be successful.

Signed-off-by: liu chuansheng &lt;chuansheng.liu@intel.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Shuah Khan &lt;shuah.khan@hp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: avoid wake_futex() for a PI futex_q</title>
<updated>2012-12-03T19:47:07Z</updated>
<author>
<name>Darren Hart</name>
<email>dvhart@linux.intel.com</email>
</author>
<published>2012-11-27T00:29:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fceca5e72e787dd0a8ea29e22a874e363389356c'/>
<id>urn:sha1:fceca5e72e787dd0a8ea29e22a874e363389356c</id>
<content type='text'>
commit aa10990e028cac3d5e255711fb9fb47e00700e35 upstream.

Dave Jones reported a bug with futex_lock_pi() that his trinity test
exposed.  Sometime between queue_me() and taking the q.lock_ptr, the
lock_ptr became NULL, resulting in a crash.

While futex_wake() is careful to not call wake_futex() on futex_q's with
a pi_state or an rt_waiter (which are either waiting for a
futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and
futex_requeue() do not perform the same test.

Update futex_wake_op() and futex_requeue() to test for q.pi_state and
q.rt_waiter and abort with -EINVAL if detected.  To ensure any future
breakage is caught, add a WARN() to wake_futex() if the same condition
is true.

This fix has seen 3 hours of testing with "trinity -c futex" on an
x86_64 VM with 4 CPUS.

[akpm@linux-foundation.org: tidy up the WARN()]
Signed-off-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Reported-by: Dave Jones &lt;davej@redat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: John Kacur &lt;jkacur@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>module: fix out-by-one error in kallsyms</title>
<updated>2012-11-26T19:37:41Z</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2012-10-25T00:19:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0de578d014506a662b239708f4a0ded7e083292c'/>
<id>urn:sha1:0de578d014506a662b239708f4a0ded7e083292c</id>
<content type='text'>
commit 59ef28b1f14899b10d6b2682c7057ca00a9a3f47 upstream.

Masaki found and patched a kallsyms issue: the last symbol in a
module's symtab wasn't transferred.  This is because we manually copy
the zero'th entry (which is always empty) then copy the rest in a loop
starting at 1, though from src[0].  His fix was minimal, I prefer to
rewrite the loops in more standard form.

There are two loops: one to get the size, and one to copy.  Make these
identical: always count entry 0 and any defined symbol in an allocated
non-init section.

This bug exists since the following commit was introduced.
   module: reduce symbol table for loaded modules (v2)
   commit: 4a4962263f07d14660849ec134ee42b63e95ea9a

LKML: http://lkml.org/lkml/2012/10/24/27
Reported-by: Masaki Kimura &lt;masaki.kimura.kz@hitachi.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Handle futex_pi OWNER_DIED take over correctly</title>
<updated>2012-11-17T21:16:22Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2012-10-23T20:29:38Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c4cbedfda2227df82126c9dd5e7593565bf45d21'/>
<id>urn:sha1:c4cbedfda2227df82126c9dd5e7593565bf45d21</id>
<content type='text'>
commit 59fa6245192159ab5e1e17b8e31f15afa9cff4bf upstream.

Siddhesh analyzed a failure in the take over of pi futexes in case the
owner died and provided a workaround.
See: http://sourceware.org/bugzilla/show_bug.cgi?id=14076

The detailed problem analysis shows:

Futex F is initialized with PTHREAD_PRIO_INHERIT and
PTHREAD_MUTEX_ROBUST_NP attributes.

T1 lock_futex_pi(F);

T2 lock_futex_pi(F);
   --&gt; T2 blocks on the futex and creates pi_state which is associated
       to T1.

T1 exits
   --&gt; exit_robust_list() runs
       --&gt; Futex F userspace value TID field is set to 0 and
           FUTEX_OWNER_DIED bit is set.

T3 lock_futex_pi(F);
   --&gt; Succeeds due to the check for F's userspace TID field == 0
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

T1 --&gt; exit_pi_state_list()
       --&gt; Transfers pi_state to waiter T2 and wakes T2 via
       	   rt_mutex_unlock(&amp;pi_state-&gt;mutex)

T2 --&gt; acquires pi_state-&gt;mutex and gains real ownership of the
       pi_state
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

T3 --&gt; observes inconsistent state

This problem is independent of UP/SMP, preemptible/non preemptible
kernels, or process shared vs. private. The only difference is that
certain configurations are more likely to expose it.

So as Siddhesh correctly analyzed the following check in
futex_lock_pi_atomic() is the culprit:

	if (unlikely(ownerdied || !(curval &amp; FUTEX_TID_MASK))) {

We check the userspace value for a TID value of 0 and take over the
futex unconditionally if that's true.

AFAICT this check is there as it is correct for a different corner
case of futexes: the WAITERS bit became stale.

Now the proposed change

-	if (unlikely(ownerdied || !(curval &amp; FUTEX_TID_MASK))) {
+       if (unlikely(ownerdied ||
+                       !(curval &amp; (FUTEX_TID_MASK | FUTEX_WAITERS)))) {

solves the problem, but it's not obvious why and it wreckages the
"stale WAITERS bit" case.

What happens is, that due to the WAITERS bit being set (T2 is blocked
on that futex) it enforces T3 to go through lookup_pi_state(), which
in the above case returns an existing pi_state and therefor forces T3
to legitimately fight with T2 over the ownership of the pi_state (via
pi_state-&gt;mutex). Probelm solved!

Though that does not work for the "WAITERS bit is stale" problem
because if lookup_pi_state() does not find existing pi_state it
returns -ERSCH (due to TID == 0) which causes futex_lock_pi() to
return -ESRCH to user space because the OWNER_DIED bit is not set.

Now there is a different solution to that problem. Do not look at the
user space value at all and enforce a lookup of possibly available
pi_state. If pi_state can be found, then the new incoming locker T3
blocks on that pi_state and legitimately races with T2 to acquire the
rt_mutex and the pi_state and therefor the proper ownership of the
user space futex.

lookup_pi_state() has the correct order of checks. It first tries to
find a pi_state associated with the user space futex and only if that
fails it checks for futex TID value = 0. If no pi_state is available
nothing can create new state at that point because this happens with
the hash bucket lock held.

So the above scenario changes to:

T1 lock_futex_pi(F);

T2 lock_futex_pi(F);
   --&gt; T2 blocks on the futex and creates pi_state which is associated
       to T1.

T1 exits
   --&gt; exit_robust_list() runs
       --&gt; Futex F userspace value TID field is set to 0 and
           FUTEX_OWNER_DIED bit is set.

T3 lock_futex_pi(F);
   --&gt; Finds pi_state and blocks on pi_state-&gt;rt_mutex

T1 --&gt; exit_pi_state_list()
       --&gt; Transfers pi_state to waiter T2 and wakes it via
       	   rt_mutex_unlock(&amp;pi_state-&gt;mutex)

T2 --&gt; acquires pi_state-&gt;mutex and gains ownership of the pi_state
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

This covers all gazillion points on which T3 might come in between
T1's exit_robust_list() clearing the TID field and T2 fixing it up. It
also solves the "WAITERS bit stale" problem by forcing the take over.

Another benefit of changing the code this way is that it makes it less
dependent on untrusted user space values and therefor minimizes the
possible wreckage which might be inflicted.

As usual after staring for too long at the futex code my brain hurts
so much that I really want to ditch that whole optimization of
avoiding the syscall for the non contended case for PI futexes and rip
out the maze of corner case handling code. Unfortunately we can't as
user space relies on that existing behaviour, but at least thinking
about it helps me to preserve my mental sanity. Maybe we should
nevertheless :)

Reported-and-tested-by: Siddhesh Poyarekar &lt;siddhesh.poyarekar@gmail.com&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1210232138540.2756@ionos
Acked-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
