<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v3.4.65</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel?h=v3.4.65</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel?h=v3.4.65'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-10-01T16:10:52Z</updated>
<entry>
<title>perf: Fix perf_cgroup_switch for sw-events</title>
<updated>2013-10-01T16:10:52Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-10-02T13:41:23Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1d48ca6f38fa39298474708abebeffef4ef2cd2d'/>
<id>urn:sha1:1d48ca6f38fa39298474708abebeffef4ef2cd2d</id>
<content type='text'>
commit 95cf59ea72331d0093010543b8951bb43f262cac upstream.

Jiri reported that he could trigger the WARN_ON_ONCE() in
perf_cgroup_switch() using sw-events. This is because sw-events share
a cpuctx with multiple PMUs.

Use the -&gt;unique_pmu pointer to limit the pmu iteration to unique
cpuctx instances.

Reported-and-Tested-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-so7wi2zf3jjzrwcutm2mkz0j@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf: Clarify perf_cpu_context::active_pmu usage by renaming it to ::unique_pmu</title>
<updated>2013-10-01T16:10:52Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-10-02T13:38:52Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2cd21fa1b54efaf6b5912ef2833fa474fdcf92b7'/>
<id>urn:sha1:2cd21fa1b54efaf6b5912ef2833fa474fdcf92b7</id>
<content type='text'>
commit 3f1f33206c16c7b3839d71372bc2ac3f305aa802 upstream.

Stephane thought the perf_cpu_context::active_pmu name confusing and
suggested using 'unique_pmu' instead.

This pointer is a pointer to a 'random' pmu sharing the cpuctx
instance, therefore limiting a for_each_pmu loop to those where
cpuctx-&gt;unique_pmu matches the pmu we get a loop over unique cpuctx
instances.

Suggested-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-kxyjqpfj2fn9gt7kwu5ag9ks@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: fail if monitored file and event_control are in different cgroup</title>
<updated>2013-10-01T16:10:51Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2013-02-18T06:13:35Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2927937899b958de968119856ba659d8a4eff037'/>
<id>urn:sha1:2927937899b958de968119856ba659d8a4eff037</id>
<content type='text'>
commit f169007b2773f285e098cb84c74aac0154d65ff7 upstream.

If we pass fd of memory.usage_in_bytes of cgroup A to cgroup.event_control
of cgroup B, then we won't get memory usage notification from A but B!

What's worse, if A and B are in different mount hierarchy, we'll end up
accessing NULL pointer!

Disallow this kind of invalid usage.

Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Weng Meiling &lt;wengmeiling.weng@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched/fair: Fix small race where child-&gt;se.parent,cfs_rq might point to invalid ones</title>
<updated>2013-10-01T16:10:51Z</updated>
<author>
<name>Daisuke Nishimura</name>
<email>nishimura@mxp.nes.nec.co.jp</email>
</author>
<published>2013-09-10T09:16:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=62875332eafea9ee150a6037c0a1a20669e02aa1'/>
<id>urn:sha1:62875332eafea9ee150a6037c0a1a20669e02aa1</id>
<content type='text'>
commit 6c9a27f5da9609fca46cb2b183724531b48f71ad upstream.

There is a small race between copy_process() and cgroup_attach_task()
where child-&gt;se.parent,cfs_rq points to invalid (old) ones.

        parent doing fork()      | someone moving the parent to another cgroup
  -------------------------------+---------------------------------------------
    copy_process()
      + dup_task_struct()
        -&gt; parent-&gt;se is copied to child-&gt;se.
           se.parent,cfs_rq of them point to old ones.

                                     cgroup_attach_task()
                                       + cgroup_task_migrate()
                                         -&gt; parent-&gt;cgroup is updated.
                                       + cpu_cgroup_attach()
                                         + sched_move_task()
                                           + task_move_group_fair()
                                             +- set_task_rq()
                                                -&gt; se.parent,cfs_rq of parent
                                                   are updated.

      + cgroup_fork()
        -&gt; parent-&gt;cgroup is copied to child-&gt;cgroup. (*1)
      + sched_fork()
        + task_fork_fair()
          -&gt; se.parent,cfs_rq of child are accessed
             while they point to old ones. (*2)

In the worst case, this bug can lead to "use-after-free" and cause a panic,
because it's new cgroup's refcount that is incremented at (*1),
so the old cgroup(and related data) can be freed before (*2).

In fact, a panic caused by this bug was originally caught in RHEL6.4.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [&lt;ffffffff81051e3e&gt;] sched_slice+0x6e/0xa0
    [...]
    Call Trace:
     [&lt;ffffffff81051f25&gt;] place_entity+0x75/0xa0
     [&lt;ffffffff81056a3a&gt;] task_fork_fair+0xaa/0x160
     [&lt;ffffffff81063c0b&gt;] sched_fork+0x6b/0x140
     [&lt;ffffffff8106c3c2&gt;] copy_process+0x5b2/0x1450
     [&lt;ffffffff81063b49&gt;] ? wake_up_new_task+0xd9/0x130
     [&lt;ffffffff8106d2f4&gt;] do_fork+0x94/0x460
     [&lt;ffffffff81072a9e&gt;] ? sys_wait4+0xae/0x100
     [&lt;ffffffff81009598&gt;] sys_clone+0x28/0x30
     [&lt;ffffffff8100b393&gt;] stub_clone+0x13/0x20
     [&lt;ffffffff8100b072&gt;] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/039601ceae06$733d3130$59b79390$@mxp.nes.nec.co.jp
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>workqueue: consider work function when searching for busy work items</title>
<updated>2013-08-29T16:50:12Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-12-18T18:35:02Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=55e3e1f419f0c387a2b971cc181b8dea1b099d1d'/>
<id>urn:sha1:55e3e1f419f0c387a2b971cc181b8dea1b099d1d</id>
<content type='text'>
commit a2c1c57be8d9fd5b716113c8991d3d702eeacf77 upstream.

To avoid executing the same work item concurrenlty, workqueue hashes
currently busy workers according to their current work items and looks
up the the table when it wants to execute a new work item.  If there
already is a worker which is executing the new work item, the new item
is queued to the found worker so that it gets executed only after the
current execution finishes.

Unfortunately, a work item may be freed while being executed and thus
recycled for different purposes.  If it gets recycled for a different
work item and queued while the previous execution is still in
progress, workqueue may make the new work item wait for the old one
although the two aren't really related in any way.

In extreme cases, this false dependency may lead to deadlock although
it's extremely unlikely given that there aren't too many self-freeing
work item users and they usually don't wait for other work items.

To alleviate the problem, record the current work function in each
busy worker and match it together with the work item address in
find_worker_executing_work().  While this isn't complete, it ensures
that unrelated work items don't interact with each other and in the
very unlikely case where a twisted wq user triggers it, it's always
onto itself making the culprit easy to spot.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Andrey Isakov &lt;andy51@gmx.ru&gt;
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
[lizf: Backported to 3.4:
 - Adjust context
 - Incorporate earlier logging cleanup in process_one_work() from
   044c782ce3a9 ('workqueue: fix checkpatch issues')]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>workqueue: fix possible stall on try_to_grab_pending() of a delayed work item</title>
<updated>2013-08-29T16:50:11Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-09-18T17:40:00Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=31eafff4382b0f20edce1afea8e2d288c6c7187c'/>
<id>urn:sha1:31eafff4382b0f20edce1afea8e2d288c6c7187c</id>
<content type='text'>
commit 3aa62497594430ea522050b75c033f71f2c60ee6 upstream.

Currently, when try_to_grab_pending() grabs a delayed work item, it
leaves its linked work items alone on the delayed_works.  The linked
work items are always NO_COLOR and will cause future
cwq_activate_first_delayed() increase cwq-&gt;nr_active incorrectly, and
may cause the whole cwq to stall.  For example,

state: cwq-&gt;max_active = 1, cwq-&gt;nr_active = 1
       one work in cwq-&gt;pool, many in cwq-&gt;delayed_works.

step1: try_to_grab_pending() removes a work item from delayed_works
       but leaves its NO_COLOR linked work items on it.

step2: Later on, cwq_activate_first_delayed() activates the linked
       work item increasing -&gt;nr_active.

step3: cwq-&gt;nr_active = 1, but all activated work items of the cwq are
       NO_COLOR.  When they finish, cwq-&gt;nr_active will not be
       decreased due to NO_COLOR, and no further work items will be
       activated from cwq-&gt;delayed_works. the cwq stalls.

Fix it by ensuring the target work item is activated before stealing
PENDING in try_to_grab_pending().  This ensures that all the linked
work items are activated without incorrectly bumping cwq-&gt;nr_active.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: backported to 3.4: adjust context]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Take hugepages into account when generating futex_key</title>
<updated>2013-08-20T15:26:28Z</updated>
<author>
<name>Zhang Yi</name>
<email>wetpzy@gmail.com</email>
</author>
<published>2013-06-25T13:19:31Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a42efb79d54d9a13c8f68df122c832bca08b74ae'/>
<id>urn:sha1:a42efb79d54d9a13c8f68df122c832bca08b74ae</id>
<content type='text'>
commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in an unique identifier for each futex.

Though this is not true when futexes are located in different subpages
of an hugepage. The reason is, that the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page-&gt;index.

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in an hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page-&gt;index.

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi &lt;zhang.yi20@zte.com.cn&gt;
Reviewed-by: Jiang Biao &lt;jiang.biao2@zte.com.cn&gt;
Tested-by: Ma Chenggong &lt;ma.chenggong@zte.com.cn&gt;
Reviewed-by: 'Mel Gorman' &lt;mgorman@suse.de&gt;
Acked-by: 'Darren Hart' &lt;dvhart@linux.intel.com&gt;
Cc: 'Peter Zijlstra' &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Mike Galbraith &lt;mgalbraith@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>tracing: Fix fields of struct trace_iterator that are zeroed by mistake</title>
<updated>2013-08-15T05:57:08Z</updated>
<author>
<name>Andrew Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2013-08-02T17:16:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=99593eb7ca1dd9bfaa431d96e009eda23f001ace'/>
<id>urn:sha1:99593eb7ca1dd9bfaa431d96e009eda23f001ace</id>
<content type='text'>
commit ed5467da0e369e65b247b99eb6403cb79172bcda upstream.

tracing_read_pipe zeros all fields bellow "seq". The declaration contains
a comment about that, but it doesn't help.

The first field is "snapshot", it's true when current open file is
snapshot. Looks obvious, that it should not be zeroed.

The second field is "started". It was converted from cpumask_t to
cpumask_var_t (v2.6.28-4983-g4462344), in other words it was
converted from cpumask to pointer on cpumask.

Currently the reference on "started" memory is lost after the first read
from tracing_read_pipe and a proper object will never be freed.

The "started" is never dereferenced for trace_pipe, because trace_pipe
can't have the TRACE_FILE_ANNOTATE options.

Link: http://lkml.kernel.org/r/1375463803-3085183-1-git-send-email-avagin@openvz.org

Signed-off-by: Andrew Vagin &lt;avagin@openvz.org&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf: Use css_tryget() to avoid propping up css refcount</title>
<updated>2013-08-11T22:38:43Z</updated>
<author>
<name>Salman Qazi</name>
<email>sqazi@google.com</email>
</author>
<published>2012-06-14T22:31:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2d8be9cf0668fab848b9b7740f0725ae9181b703'/>
<id>urn:sha1:2d8be9cf0668fab848b9b7740f0725ae9181b703</id>
<content type='text'>
commit 9c5da09d266ca9b32eb16cf940f8161d949c2fe5 upstream.

An rmdir pushes css's ref count to zero.  However, if the associated
directory is open at the time, the dentry ref count is non-zero.  If
the fd for this directory is then passed into perf_event_open, it
does a css_get().  This bounces the ref count back up from zero.  This
is a problem by itself.  But what makes it turn into a crash is the
fact that we end up doing an extra dput, since we perform a dput
when css_put sees the ref count go down to zero.

css_tryget() does not fall into that trap. So, we use that instead.

Reproduction test-case for the bug:

 #include &lt;unistd.h&gt;
 #include &lt;sys/types.h&gt;
 #include &lt;sys/stat.h&gt;
 #include &lt;fcntl.h&gt;
 #include &lt;linux/unistd.h&gt;
 #include &lt;linux/perf_event.h&gt;
 #include &lt;string.h&gt;
 #include &lt;errno.h&gt;
 #include &lt;stdio.h&gt;

 #define PERF_FLAG_PID_CGROUP    (1U &lt;&lt; 2)

 int perf_event_open(struct perf_event_attr *hw_event_uptr,
                     pid_t pid, int cpu, int group_fd, unsigned long flags) {
         return syscall(__NR_perf_event_open,hw_event_uptr, pid, cpu,
                 group_fd, flags);
 }

 /*
  * Directly poke at the perf_event bug, since it's proving hard to repro
  * depending on where in the kernel tree.  what moved?
  */
 int main(int argc, char **argv)
 {
        int fd;
        struct perf_event_attr attr;
        memset(&amp;attr, 0, sizeof(attr));
        attr.exclude_kernel = 1;
        attr.size = sizeof(attr);
        mkdir("/dev/cgroup/perf_event/blah", 0777);
        fd = open("/dev/cgroup/perf_event/blah", O_RDONLY);
        perror("open");
        rmdir("/dev/cgroup/perf_event/blah");
        sleep(2);
        perf_event_open(&amp;attr, fd, 0, -1,  PERF_FLAG_PID_CGROUP);
        perror("perf_event_open");
        close(fd);
        return 0;
 }

Signed-off-by: Salman Qazi &lt;sqazi@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: http://lkml.kernel.org/r/20120614223108.1025.2503.stgit@dungbeetle.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf: Fix event group context move</title>
<updated>2013-08-11T22:38:43Z</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2013-02-01T10:23:45Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=5cbff87c00cf1b8e3a077465e4ec501dd21651c8'/>
<id>urn:sha1:5cbff87c00cf1b8e3a077465e4ec501dd21651c8</id>
<content type='text'>
commit 0231bb5336758426b44ccd798ccd3c5419c95d58 upstream.

When we have group with mixed events (hw/sw) we want to end up
with group leader being in hw context. So if group leader is
initialy sw event, we move all the events under hw context.

The move is done for each event by removing it from its context
and adding it back into proper one. As a part of the removal the
event is automatically disabled, which is not what we want at
this stage of creating groups.

The fix is to initialize event state after removal from sw
context.

This fix resulted from the following discussion:

  http://thread.gmane.org/gmane.linux.kernel.perf.user/1144

Reported-by: Andreas Hollmann &lt;hollmann@in.tum.de&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Vince Weaver &lt;vince@deater.net&gt;
Link: http://lkml.kernel.org/r/1359714225-4231-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
