<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/events, branch v3.15</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/events?h=v3.15</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/events?h=v3.15'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2014-05-19T12:44:56Z</updated>
<entry>
<title>perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()</title>
<updated>2014-05-19T12:44:56Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-03-14T09:50:33Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b69cf53640da2b86439596118cfa95233154ee76'/>
<id>urn:sha1:b69cf53640da2b86439596118cfa95233154ee76</id>
<content type='text'>
Alexander noticed that we use RCU iteration on rb-&gt;event_list but do
not use list_{add,del}_rcu() to add,remove entries to that list, nor
do we observe proper grace periods when re-using the entries.

Merge ring_buffer_detach() into ring_buffer_attach() such that
attaching to the NULL buffer is detaching.

Furthermore, ensure that between any 'detach' and 'attach' of the same
event we observe the required grace period, but only when strictly
required. In effect this means that only ioctl(.request =
PERF_EVENT_IOC_SET_OUTPUT) will wait for a grace period, while the
normal initial attach and final detach will not be delayed.

This patch should, I think, do the right thing under all
circumstances, the 'normal' cases all should never see the extra grace
period, but the two cases:

 1) PERF_EVENT_IOC_SET_OUTPUT on an event which already has a
    ring_buffer set, will now observe the required grace period between
    removing itself from the old and attaching itself to the new buffer.

    This case is 'simple' in that both buffers are present in
    perf_event_set_output() one could think an unconditional
    synchronize_rcu() would be sufficient; however...

 2) an event that has a buffer attached, the buffer is destroyed
    (munmap) and then the event is attached to a new/different buffer
    using PERF_EVENT_IOC_SET_OUTPUT.

    This case is more complex because the buffer destruction does:
      ring_buffer_attach(.rb = NULL)
    followed by the ioctl() doing:
      ring_buffer_attach(.rb = foo);

    and we still need to observe the grace period between these two
    calls due to us reusing the event-&gt;rb_entry list_head.

In order to make 2 happen we use Paul's latest cond_synchronize_rcu()
call.

Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Reported-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20140507123526.GD13658@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>perf: Prevent false warning in perf_swevent_add</title>
<updated>2014-05-19T12:44:55Z</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2014-04-07T09:04:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=39af6b1678afa5880dda7e375cf3f9d395087f6d'/>
<id>urn:sha1:39af6b1678afa5880dda7e375cf3f9d395087f6d</id>
<content type='text'>
The perf cpu offline callback takes down all cpu context
events and releases swhash-&gt;swevent_hlist.

This could race with task context software event being just
scheduled on this cpu via perf_swevent_add while cpu hotplug
code already cleaned up event's data.

The race happens in the gap between the cpu notifier code
and the cpu being actually taken down. Note that only cpu
ctx events are terminated in the perf cpu hotplug code.

It's easily reproduced with:
  $ perf record -e faults perf bench sched pipe

while putting one of the cpus offline:
  # echo 0 &gt; /sys/devices/system/cpu/cpu1/online

Console emits following warning:
  WARNING: CPU: 1 PID: 2845 at kernel/events/core.c:5672 perf_swevent_add+0x18d/0x1a0()
  Modules linked in:
  CPU: 1 PID: 2845 Comm: sched-pipe Tainted: G        W    3.14.0+ #256
  Hardware name: Intel Corporation Montevina platform/To be filled by O.E.M., BIOS AMVACRB1.86C.0066.B00.0805070703 05/07/2008
   0000000000000009 ffff880077233ab8 ffffffff81665a23 0000000000200005
   0000000000000000 ffff880077233af8 ffffffff8104732c 0000000000000046
   ffff88007467c800 0000000000000002 ffff88007a9cf2a0 0000000000000001
  Call Trace:
   [&lt;ffffffff81665a23&gt;] dump_stack+0x4f/0x7c
   [&lt;ffffffff8104732c&gt;] warn_slowpath_common+0x8c/0xc0
   [&lt;ffffffff8104737a&gt;] warn_slowpath_null+0x1a/0x20
   [&lt;ffffffff8110fb3d&gt;] perf_swevent_add+0x18d/0x1a0
   [&lt;ffffffff811162ae&gt;] event_sched_in.isra.75+0x9e/0x1f0
   [&lt;ffffffff8111646a&gt;] group_sched_in+0x6a/0x1f0
   [&lt;ffffffff81083dd5&gt;] ? sched_clock_local+0x25/0xa0
   [&lt;ffffffff811167e6&gt;] ctx_sched_in+0x1f6/0x450
   [&lt;ffffffff8111757b&gt;] perf_event_sched_in+0x6b/0xa0
   [&lt;ffffffff81117a4b&gt;] perf_event_context_sched_in+0x7b/0xc0
   [&lt;ffffffff81117ece&gt;] __perf_event_task_sched_in+0x43e/0x460
   [&lt;ffffffff81096f1e&gt;] ? put_lock_stats.isra.18+0xe/0x30
   [&lt;ffffffff8107b3c8&gt;] finish_task_switch+0xb8/0x100
   [&lt;ffffffff8166a7de&gt;] __schedule+0x30e/0xad0
   [&lt;ffffffff81172dd2&gt;] ? pipe_read+0x3e2/0x560
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b45e&gt;] ? preempt_schedule_irq+0x3e/0x70
   [&lt;ffffffff8166b464&gt;] preempt_schedule_irq+0x44/0x70
   [&lt;ffffffff816707f0&gt;] retint_kernel+0x20/0x30
   [&lt;ffffffff8109e60a&gt;] ? lockdep_sys_exit+0x1a/0x90
   [&lt;ffffffff812a4234&gt;] lockdep_sys_exit_thunk+0x35/0x67
   [&lt;ffffffff81679321&gt;] ? sysret_check+0x5/0x56

Fixing this by tracking the cpu hotplug state and displaying
the WARN only if current cpu is initialized properly.

Cc: Corey Ashford &lt;cjashfor@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: stable@vger.kernel.org
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1396861448-10097-1-git-send-email-jolsa@redhat.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>perf: Limit perf_event_attr::sample_period to 63 bits</title>
<updated>2014-05-19T12:44:55Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-15T18:23:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0819b2e30ccb93edf04876237b6205eef84ec8d2'/>
<id>urn:sha1:0819b2e30ccb93edf04876237b6205eef84ec8d2</id>
<content type='text'>
Vince reported that using a large sample_period (one with bit 63 set)
results in wreckage since while the sample_period is fundamentally
unsigned (negative periods don't make sense) the way we implement
things very much rely on signed logic.

So limit sample_period to 63 bits to avoid tripping over this.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/n/tip-p25fhunibl4y3qi0zuqmyf4b@git.kernel.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>perf: Fix perf_event_init_context()</title>
<updated>2014-05-07T09:33:15Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-05T17:12:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ffb4ef21ac4308c2e738e6f83b6741bbc9b4fa3b'/>
<id>urn:sha1:ffb4ef21ac4308c2e738e6f83b6741bbc9b4fa3b</id>
<content type='text'>
perf_pin_task_context() can return NULL but perf_event_init_context()
assumes it will not, correct this.

Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Link: http://lkml.kernel.org/r/20140505171428.GU26782@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix race in removing an event</title>
<updated>2014-05-07T09:33:14Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-05-02T14:56:01Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=46ce0fe97a6be7532ce6126bb26ce89fed81528c'/>
<id>urn:sha1:46ce0fe97a6be7532ce6126bb26ce89fed81528c</id>
<content type='text'>
When removing a (sibling) event we do:

	raw_spin_lock_irq(&amp;ctx-&gt;lock);
	perf_group_detach(event);
	raw_spin_unlock_irq(&amp;ctx-&gt;lock);

	&lt;hole&gt;

	perf_remove_from_context(event);
		raw_spin_lock_irq(&amp;ctx-&gt;lock);
		...
		raw_spin_unlock_irq(&amp;ctx-&gt;lock);

Now, assuming the event is a sibling, it will be 'unreachable' for
things like ctx_sched_out() because that iterates the
groups-&gt;siblings, and we just unhooked the sibling.

So, if during &lt;hole&gt; we get ctx_sched_out(), it will miss the event
and not call event_sched_out() on it, leaving it programmed on the
PMU.

The subsequent perf_remove_from_context() call will find the ctx is
inactive and only call list_del_event() to remove the event from all
other lists.

Hereafter we can proceed to free the event; while still programmed!

Close this hole by moving perf_group_detach() inside the same
ctx-&gt;lock region(s) perf_remove_from_context() has.

The condition on inherited events only in __perf_event_exit_task() is
likely complete crap because non-inherited events are part of groups
too and we're tearing down just the same. But leave that for another
patch.

Most-likely-Fixes: e03a9a55b4e ("perf: Change close() semantics for group events")
Reported-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Tested-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Much-staring-at-traces-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20140505093124.GN17778@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm</title>
<updated>2014-04-05T20:20:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-04-05T20:20:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2d1eb87ae1e6f3999e77989fd2f831b134270787'/>
<id>urn:sha1:2d1eb87ae1e6f3999e77989fd2f831b134270787</id>
<content type='text'>
Pull ARM changes from Russell King:

 - Perf updates from Will Deacon:
   - Support for Qualcomm Krait processors (run perf on your phone!)
   - Support for Cortex-A12 (run perf stat on your FPGA!)
   - Support for perf_sample_event_took, allowing us to automatically decrease
     the sample rate if we can't handle the PMU interrupts quickly enough
     (run perf record on your FPGA!).

 - Basic uprobes support from David Long:
     This patch series adds basic uprobes support to ARM. It is based on
     patches developed earlier by Rabin Vincent. That approach of adding
     hooks into the kprobes instruction parsing code was not well received.
     This approach separates the ARM instruction parsing code in kprobes out
     into a separate set of functions which can be used by both kprobes and
     uprobes. Both kprobes and uprobes then provide their own semantic action
     tables to process the results of the parsing.

 - ARMv7M (microcontroller) updates from Uwe Kleine-König

 - OMAP DMA updates (recently added Vinod's Ack even though they've been
   sitting in linux-next for a few months) to reduce the reliance of
   omap-dma on the code in arch/arm.

 - SA11x0 changes from Dmitry Eremin-Solenikov and Alexander Shiyan

 - Support for Cortex-A12 CPU

 - Align support for ARMv6 with ARMv7 so they can cooperate better in a
   single zImage.

 - Addition of first AT_HWCAP2 feature bits for ARMv8 crypto support.

 - Removal of IRQ_DISABLED from various ARM files

 - Improved efficiency of virt_to_page() for single zImage

 - Patch from Ulf Hansson to permit runtime PM callbacks to be available for
   AMBA devices for suspend/resume as well.

 - Finally kill asm/system.h on ARM.

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (89 commits)
  dmaengine: omap-dma: more consolidation of CCR register setup
  dmaengine: omap-dma: move IRQ handling to omap-dma
  dmaengine: omap-dma: move register read/writes into omap-dma.c
  ARM: omap: dma: get rid of 'p' allocation and clean up
  ARM: omap: move dma channel allocation into plat-omap code
  ARM: omap: dma: get rid of errata global
  ARM: omap: clean up DMA register accesses
  ARM: omap: remove almost-const variables
  ARM: omap: remove references to disable_irq_lch
  dmaengine: omap-dma: cleanup errata 3.3 handling
  dmaengine: omap-dma: provide register read/write functions
  dmaengine: omap-dma: use cached CCR value when enabling DMA
  dmaengine: omap-dma: move barrier to omap_dma_start_desc()
  dmaengine: omap-dma: move clnk_ctrl setting to preparation functions
  dmaengine: omap-dma: improve efficiency loading C.SA/C.EI/C.FI registers
  dmaengine: omap-dma: consolidate clearing channel status register
  dmaengine: omap-dma: move CCR buffering disable errata out of the fast path
  dmaengine: omap-dma: provide register definitions
  dmaengine: omap-dma: consolidate setup of CCR
  dmaengine: omap-dma: consolidate setup of CSDP
  ...
</content>
</entry>
<entry>
<title>Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2014-04-03T20:05:42Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-04-03T20:05:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=32d01dc7be4e725ab85ce1d74e8f4adc02ad68dd'/>
<id>urn:sha1:32d01dc7be4e725ab85ce1d74e8f4adc02ad68dd</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "A lot updates for cgroup:

   - The biggest one is cgroup's conversion to kernfs.  cgroup took
     after the long abandoned vfs-entangled sysfs implementation and
     made it even more convoluted over time.  cgroup's internal objects
     were fused with vfs objects which also brought in vfs locking and
     object lifetime rules.  Naturally, there are places where vfs rules
     don't fit and nasty hacks, such as credential switching or lock
     dance interleaving inode mutex and cgroup_mutex with object serial
     number comparison thrown in to decide whether the operation is
     actually necessary, needed to be employed.

     After conversion to kernfs, internal object lifetime and locking
     rules are mostly isolated from vfs interactions allowing shedding
     of several nasty hacks and overall simplification.  This will also
     allow implmentation of operations which may affect multiple cgroups
     which weren't possible before as it would have required nesting
     i_mutexes.

   - Various simplifications including dropping of module support,
     easier cgroup name/path handling, simplified cgroup file type
     handling and task_cg_lists optimization.

   - Prepatory changes for the planned unified hierarchy, which is still
     a patchset away from being actually operational.  The dummy
     hierarchy is updated to serve as the default unified hierarchy.
     Controllers which aren't claimed by other hierarchies are
     associated with it, which BTW was what the dummy hierarchy was for
     anyway.

   - Various fixes from Li and others.  This pull request includes some
     patches to add missing slab.h to various subsystems.  This was
     triggered xattr.h include removal from cgroup.h.  cgroup.h
     indirectly got included a lot of files which brought in xattr.h
     which brought in slab.h.

  There are several merge commits - one to pull in kernfs updates
  necessary for converting cgroup (already in upstream through
  driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
  cgroup: remove useless argument from cgroup_exit()
  cgroup: fix spurious lockdep warning in cgroup_exit()
  cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
  cgroup: break kernfs active_ref protection in cgroup directory operations
  cgroup: fix cgroup_taskset walking order
  cgroup: implement CFTYPE_ONLY_ON_DFL
  cgroup: make cgrp_dfl_root mountable
  cgroup: drop const from @buffer of cftype-&gt;write_string()
  cgroup: rename cgroup_dummy_root and related names
  cgroup: move -&gt;subsys_mask from cgroupfs_root to cgroup
  cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
  cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
  cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
  cgroup: reorganize cgroup bootstrapping
  cgroup: relocate setting of CGRP_DEAD
  cpuset: use rcu_read_lock() to protect task_cs()
  cgroup_freezer: document freezer_fork() subtleties
  cgroup: update cgroup_transfer_tasks() to either succeed or fail
  cgroup: drop task_lock() protection around task-&gt;cgroups
  cgroup: update how a newly forked task gets associated with css_set
  ...
</content>
</entry>
<entry>
<title>uprobes: allow ignoring of probe hits</title>
<updated>2014-03-18T20:39:34Z</updated>
<author>
<name>David A. Long</name>
<email>dave.long@linaro.org</email>
</author>
<published>2014-02-03T19:25:49Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6fe50a28ba6e5fafb4a549dea666dd15297dd8bd'/>
<id>urn:sha1:6fe50a28ba6e5fafb4a549dea666dd15297dd8bd</id>
<content type='text'>
Allow arches to decided to ignore a probe hit.  ARM will use this to
only call handlers if the conditions to execute a conditionally executed
instruction are satisfied.

Signed-off-by: David A. Long &lt;dave.long@linaro.org&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf: Optimize group_sched_in()</title>
<updated>2014-02-27T11:43:26Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2014-02-24T11:43:31Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4a2345937c17722bd2979f662ae909846b4a052a'/>
<id>urn:sha1:4a2345937c17722bd2979f662ae909846b4a052a</id>
<content type='text'>
Use the ctx pmu instead of the event pmu.

When a group leader is a software event but the group contains
hardware events, the entire group is on the hardware PMU.

Using the hardware PMU for the transaction makes most sense since
that's the most expensive one to programm (and software PMUs generally
don't have TXN support anyway).

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lkml.kernel.org/n/tip-sctoo9t2f3nn2c9g568928q3@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Remove redundant PMU assignment</title>
<updated>2014-02-27T11:43:24Z</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2014-02-10T17:44:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fdded676c3ef680bf1abc415d307d7e69a6768d1'/>
<id>urn:sha1:fdded676c3ef680bf1abc415d307d7e69a6768d1</id>
<content type='text'>
Currently perf_branch_stack_sched_in iterates over the set of pmus,
checks that each pmu has a flush_branch_stack callback, then overwrites
the pmu before calling the callback. This is either redundant or broken.

In systems with a single hw pmu, pmu == cpuctx-&gt;ctx.pmu, and thus the
assignment is redundant.

In systems with multiple hw pmus (i.e. multiple pmus with task_ctx_nr ==
perf_hw_context) the pmus share the same perf_cpu_context. Thus the
assignment can cause one of the pmus to flush its branch stack
repeatedly rather than causing each of the pmus to flush their branch
stacks. Worse still, if only some pmus have the callback the assignment
can result in a branch to NULL.

This patch removes the redundant assignment.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Link: http://lkml.kernel.org/r/1392054264-23570-3-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
