<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v2.6.34.9</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel?h=v2.6.34.9</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel?h=v2.6.34.9'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2011-04-17T20:16:18Z</updated>
<entry>
<title>Relax si_code check in rt_sigqueueinfo and rt_tgsigqueueinfo</title>
<updated>2011-04-17T20:16:18Z</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2011-03-28T21:13:35Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=40b77a71b25fd1ec84fdc3abc2f928a872884b10'/>
<id>urn:sha1:40b77a71b25fd1ec84fdc3abc2f928a872884b10</id>
<content type='text'>
commit 243b422af9ea9af4ead07a8ad54c90d4f9b6081a upstream.

Commit da48524eb206 ("Prevent rt_sigqueueinfo and rt_tgsigqueueinfo
from spoofing the signal code") made the check on si_code too strict.
There are several legitimate places where glibc wants to queue a
negative si_code different from SI_QUEUE:

 - This was first noticed with glibc's aio implementation, which wants
   to queue a signal with si_code SI_ASYNCIO; the current kernel
   causes glibc's tst-aio4 test to fail because rt_sigqueueinfo()
   fails with EPERM.

 - Further examination of the glibc source shows that getaddrinfo_a()
   wants to use SI_ASYNCNL (which the kernel does not even define).
   The timer_create() fallback code wants to queue signals with SI_TIMER.

As suggested by Oleg Nesterov &lt;oleg@redhat.com&gt;, loosen the check to
forbid only the problematic SI_TKILL case.

Reported-by: Klaus Dittrich &lt;kladit@arcor.de&gt;
Acked-by: Julien Tinnes &lt;jln@google.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>Prevent rt_sigqueueinfo and rt_tgsigqueueinfo from spoofing the signal code</title>
<updated>2011-04-17T20:16:17Z</updated>
<author>
<name>Julien Tinnes</name>
<email>jln@google.com</email>
</author>
<published>2011-03-18T22:05:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9eab773eb3eda5d8336a8f0ac68c5e1faa5f88d5'/>
<id>urn:sha1:9eab773eb3eda5d8336a8f0ac68c5e1faa5f88d5</id>
<content type='text'>
commit da48524eb20662618854bb3df2db01fc65f3070c upstream.

Userland should be able to trust the pid and uid of the sender of a
signal if the si_code is SI_TKILL.

Unfortunately, the kernel has historically allowed sigqueueinfo() to
send any si_code at all (as long as it was negative - to distinguish it
from kernel-generated signals like SIGILL etc), so it could spoof a
SI_TKILL with incorrect siginfo values.

Happily, it looks like glibc has always set si_code to the appropriate
SI_QUEUE, so there are probably no actual user code that ever uses
anything but the appropriate SI_QUEUE flag.

So just tighten the check for si_code (we used to allow any negative
value), and add a (one-time) warning in case there are binaries out
there that might depend on using other si_code values.

Signed-off-by: Julien Tinnes &lt;jln@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>posix-cpu-timers: workaround to suppress the problems with mt exec</title>
<updated>2011-04-17T20:16:15Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2010-11-05T15:53:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ebc34eb7134b13d6a886f185e6ffd4535e7da220'/>
<id>urn:sha1:ebc34eb7134b13d6a886f185e6ffd4535e7da220</id>
<content type='text'>
commit e0a70217107e6f9844628120412cb27bb4cea194 upstream.

posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal-&gt;cpu_timers list.

But, it also assumes that timer-&gt;it.cpu.task is always the group
leader, and thus the dead -&gt;task means the dead thread group.

This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.

It is not simple to fix this bug correctly. First of all, I think
that timer-&gt;it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.

Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.

In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>tracing: Fix panic when lseek() called on "trace" opened for writing</title>
<updated>2011-04-17T20:16:10Z</updated>
<author>
<name>Slava Pestov</name>
<email>slavapestov@google.com</email>
</author>
<published>2010-11-24T23:13:16Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ae621e2431fa476dcbe1daf1d9d278085b0e95e3'/>
<id>urn:sha1:ae621e2431fa476dcbe1daf1d9d278085b0e95e3</id>
<content type='text'>
commit 364829b1263b44aa60383824e4c1289d83d78ca7 upstream.

The file_ops struct for the "trace" special file defined llseek as seq_lseek().
However, if the file was opened for writing only, seq_open() was not called,
and the seek would dereference a null pointer, file-&gt;private_data.

This patch introduces a new wrapper for seq_lseek() which checks if the file
descriptor is opened for reading first. If not, it does nothing.

Signed-off-by: Slava Pestov &lt;slavapestov@google.com&gt;
LKML-Reference: &lt;1290640396-24179-1-git-send-email-slavapestov@google.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>PM / Hibernate: Fix PM_POST_* notification with user-space suspend</title>
<updated>2011-04-17T20:16:09Z</updated>
<author>
<name>Takashi Iwai</name>
<email>tiwai@suse.de</email>
</author>
<published>2010-12-09T23:16:39Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e3dc1fabddcbc9a4d3fabb74671d521e68da2206'/>
<id>urn:sha1:e3dc1fabddcbc9a4d3fabb74671d521e68da2206</id>
<content type='text'>
commit 1497dd1d29c6a53fcd3c80f7ac8d0e0239e7389e upstream.

The user-space hibernation sends a wrong notification after the image
restoration because of thinko for the file flag check.  RDONLY
corresponds to hibernation and WRONLY to restoration, confusingly.

Signed-off-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>nohz: Fix get_next_timer_interrupt() vs cpu hotplug</title>
<updated>2011-04-17T20:16:07Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2010-12-01T09:11:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ea9f5e5405fac791c3e1cde11e6254b3c1481fde'/>
<id>urn:sha1:ea9f5e5405fac791c3e1cde11e6254b3c1481fde</id>
<content type='text'>
commit dbd87b5af055a0cc9bba17795c9a2b0d17795389 upstream.

This fixes a bug as seen on 2.6.32 based kernels where timers got
enqueued on offline cpus.

If a cpu goes offline it might still have pending timers. These will
be migrated during CPU_DEAD handling after the cpu is offline.
However while the cpu is going offline it will schedule the idle task
which will then call tick_nohz_stop_sched_tick().

That function in turn will call get_next_timer_intterupt() to figure
out if the tick of the cpu can be stopped or not. If it turns out that
the next tick is just one jiffy off (delta_jiffies == 1)
tick_nohz_stop_sched_tick() incorrectly assumes that the tick should
not stop and takes an early exit and thus it won't update the load
balancer cpu.

Just afterwards the cpu will be killed and the load balancer cpu could
be the offline cpu.

On 2.6.32 based kernel get_nohz_load_balancer() gets called to decide
on which cpu a timer should be enqueued (see __mod_timer()). Which
leads to the possibility that timers get enqueued on an offline cpu.
These will never expire and can cause a system hang.

This has been observed 2.6.32 kernels. On current kernels
__mod_timer() uses get_nohz_timer_target() which doesn't have that
problem. However there might be other problems because of the too
early exit tick_nohz_stop_sched_tick() in case a cpu goes offline.

The easiest and probably safest fix seems to be to let
get_next_timer_interrupt() just lie and let it say there isn't any
pending timer if the current cpu is offline.

I also thought of moving migrate_[hr]timers() from CPU_DEAD to
CPU_DYING, but seeing that there already have been fixes at least in
the hrtimer code in this area I'm afraid that this could add new
subtle bugs.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
LKML-Reference: &lt;20101201091109.GA8984@osiris.boeblingen.de.ibm.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>nohz: Fix printk_needs_cpu() return value on offline cpus</title>
<updated>2011-04-17T20:16:07Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2010-11-26T12:00:59Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fbf23913c63625504579db05abe6a2c2f3c7e66a'/>
<id>urn:sha1:fbf23913c63625504579db05abe6a2c2f3c7e66a</id>
<content type='text'>
commit 61ab25447ad6334a74e32f60efb135a3467223f8 upstream.

This patch fixes a hang observed with 2.6.32 kernels where timers got enqueued
on offline cpus.

printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
offlined it schedules the idle process which, before killing its own cpu, will
call tick_nohz_stop_sched_tick(). That function in turn will call
printk_needs_cpu() in order to check if the local tick can be disabled. On
offline cpus this function should naturally return 0 since regardless if the
tick gets disabled or not the cpu will be dead short after. That is besides the
fact that __cpu_disable() should already have made sure that no interrupts on
the offlined cpu will be delivered anyway.

In this case it prevents tick_nohz_stop_sched_tick() to call
select_nohz_load_balancer(). No idea if that really is a problem. However what
made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
used within __mod_timer() to select a cpu on which a timer gets enqueued. If
printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
updated when a cpu gets offlined. It may contain the cpu number of an offline
cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
they never expire and cause system hangs.

This has been observed 2.6.32 kernels. On current kernels __mod_timer() uses
get_nohz_timer_target() which doesn't have that problem. However there might be
other problems because of the too early exit tick_nohz_stop_sched_tick() in
case a cpu goes offline.

Easiest way to fix this is just to test if the current cpu is offline and call
printk_tick() directly which clears the condition.

Alternatively I tried a cpu hotplug notifier which would clear the condition,
however between calling the notifier function and printk_needs_cpu() something
could have called printk() again and the problem is back again. This seems to
be the safest fix.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
LKML-Reference: &lt;20101126120235.406766476@de.ibm.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>sysctl: fix min/max handling in __do_proc_doulongvec_minmax()</title>
<updated>2011-04-17T20:15:55Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-10-07T19:59:29Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9b8796a9ce81ddc365946858d85a7656e66b0e59'/>
<id>urn:sha1:9b8796a9ce81ddc365946858d85a7656e66b0e59</id>
<content type='text'>
commit 27b3d80a7b6adcf069b5e869e4efcc3a79f88a91 upstream.

When proc_doulongvec_minmax() is used with an array of longs, and no
min/max check requested (.extra1 or .extra2 being NULL), we dereference a
NULL pointer for the second element of the array.

Noticed while doing some changes in network stack for the "16TB problem"

Fix is to not change min &amp; max pointers in __do_proc_doulongvec_minmax(),
so that all elements of the vector share an unique min/max limit, like
proc_dointvec_minmax().

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Americo Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>sysctl: min/max bounds are optional</title>
<updated>2011-04-17T20:15:55Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-10-15T21:34:12Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ba6929c72ac7d900504d9c355b2e84fe27533198'/>
<id>urn:sha1:ba6929c72ac7d900504d9c355b2e84fe27533198</id>
<content type='text'>
commit a9febbb4bd1302b6f01aa1203b0a804e4e5c9e25 upstream

sysctl check complains with a WARN() when proc_doulongvec_minmax() or
proc_doulongvec_ms_jiffies_minmax() are used by a vector of longs (with
more than one element), with no min or max value specified.

This is unexpected, given we had a bug on this min/max handling :)

Reported-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>do_exit(): make sure that we run with get_fs() == USER_DS</title>
<updated>2011-04-17T20:15:51Z</updated>
<author>
<name>Nelson Elhage</name>
<email>nelhage@ksplice.com</email>
</author>
<published>2010-12-02T22:31:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ab1c9b9db0831464307f130ba0fd13a27c852e4f'/>
<id>urn:sha1:ab1c9b9db0831464307f130ba0fd13a27c852e4f</id>
<content type='text'>
commit 33dd94ae1ccbfb7bf0fb6c692bc3d1c4269e6177 upstream.

If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
otherwise reset before do_exit().  do_exit may later (via mm_release in
fork.c) do a put_user to a user-controlled address, potentially allowing
a user to leverage an oops into a controlled write into kernel memory.

This is only triggerable in the presence of another bug, but this
potentially turns a lot of DoS bugs into privilege escalations, so it's
worth fixing.  I have proof-of-concept code which uses this bug along
with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
I've tested that this is not theoretical.

A more logical place to put this fix might be when we know an oops has
occurred, before we call do_exit(), but that would involve changing
every architecture, in multiple places.

Let's just stick it in do_exit instead.

[akpm@linux-foundation.org: update code comment]
Signed-off-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
</feed>
