<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v3.1.1</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel?h=v3.1.1</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel?h=v3.1.1'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2011-11-11T17:44:47Z</updated>
<entry>
<title>PM / Suspend: Off by one in pm_suspend()</title>
<updated>2011-11-11T17:44:47Z</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2011-09-21T18:55:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ca9b9d7bf6fc89a8a72c6aa37c683d02b7782b0b'/>
<id>urn:sha1:ca9b9d7bf6fc89a8a72c6aa37c683d02b7782b0b</id>
<content type='text'>
commit 528f7ce6e439edeac38f6b3f8561f1be129b5e91 upstream.

In enter_state() we use "state" as an offset into the pm_states[]
array.  The pm_states[] array only has PM_SUSPEND_MAX elements, so the
range check that gates enter_state() is off by one.
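
The shape of the fix (a reconstruction for illustration, not the
verbatim diff): the last valid index is PM_SUSPEND_MAX - 1, so the
guard needs a strict comparison.

  /* before: lets state == PM_SUSPEND_MAX through, one past the end */
  if (state &gt; PM_SUSPEND_ON &amp;&amp; state &lt;= PM_SUSPEND_MAX)
          return enter_state(state);

  /* after: a strict upper bound keeps the pm_states[] access in range */
  if (state &gt; PM_SUSPEND_ON &amp;&amp; state &lt; PM_SUSPEND_MAX)
          return enter_state(state);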

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;


</content>
</entry>
<entry>
<title>genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier</title>
<updated>2011-11-11T17:43:13Z</updated>
<author>
<name>Ian Campbell</name>
<email>ian.campbell@citrix.com</email>
</author>
<published>2011-10-03T14:37:00Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=cce574ff161d0f5e4263ed28a4a3785220802922'/>
<id>urn:sha1:cce574ff161d0f5e4263ed28a4a3785220802922</id>
<content type='text'>
commit 9bab0b7fbaceec47d32db51cd9e59c82fb071f5a upstream.

This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.

Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.

This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME".
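
As a sketch of how a driver opts in (hypothetical call site; in the
mainline header the flag is spelled IRQF_EARLY_RESUME, and the handler
name and cookie below are placeholders):

  /* Hypothetical: an IPI-style IRQ that must be working again during
   * syscore_resume, before ordinary device interrupts are resumed. */
  err = request_irq(irq, resched_interrupt,
                    IRQF_FORCE_RESUME | IRQF_EARLY_RESUME,
                    "resched", NULL);
  if (err)
          pr_err("resched IPI: request_irq failed: %d\n", err);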

Signed-off-by: Ian Campbell &lt;ian.campbell@citrix.com&gt;
Cc: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Cc: Jeremy Fitzhardinge &lt;Jeremy.Fitzhardinge@citrix.com&gt;
Cc: xen-devel &lt;xen-devel@lists.xensource.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>tracing: Fix returning of duplicate data after EOF in trace_pipe_raw</title>
<updated>2011-11-11T17:43:13Z</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2011-10-14T14:44:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=734c81f73201c96a4669793c27f363a334770d79'/>
<id>urn:sha1:734c81f73201c96a4669793c27f363a334770d79</id>
<content type='text'>
commit 436fc280261dcfce5af38f08b89287750dc91cd2 upstream.

The trace_pipe_raw handler holds a cached page from the time the file
is opened to the time it is closed. The cached page is used to handle
the case of the user space buffer being smaller than what was read from
the ring buffer. The left over buffer is held in the cache so that the
next read will continue where the data left off.

After EOF is returned (no more data in the buffer), the index of
the cached page is set to zero. If a user app reads the page again
after EOF, the check in the read handler sees that the cached-page
index is less than the page size and returns the cached page again.
Reading trace_pipe_raw again after EOF therefore returns duplicate
data, making the output look as if time went backwards when in fact
the same data is simply repeated.

The fix is not to reset the index as soon as all data has been read
from the cache, but only once the cache is drained and more data has
actually been read from the ring buffer.
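
In outline, the corrected flow in the read handler looks like this
(a reconstruction with 3.1-era names, not the verbatim diff):

  /* tracing_buffers_read(), sketched: */
  if (info-&gt;read &lt; PAGE_SIZE)
          goto read;              /* serve the rest of the cached page */

  ret = ring_buffer_read_page(iter-&gt;tr-&gt;buffer, &amp;info-&gt;spare,
                              count, iter-&gt;cpu_file, 0);
  if (ret &lt; 0)
          return 0;               /* EOF: leave the cache index alone */

  info-&gt;read = 0;                 /* rewind only once fresh data arrived */
  read:
          size = PAGE_SIZE - info-&gt;read;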

Reported-by: Jeremy Eder &lt;jeder@redhat.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>ftrace/kprobes: Fix not to delete probes if in use</title>
<updated>2011-11-11T17:43:13Z</updated>
<author>
<name>Masami Hiramatsu</name>
<email>masami.hiramatsu.pt@hitachi.com</email>
</author>
<published>2011-10-04T10:44:38Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=16fd0c4105bb27e0050ce4a5d948408f7369ad63'/>
<id>urn:sha1:16fd0c4105bb27e0050ce4a5d948408f7369ad63</id>
<content type='text'>
commit 02ca1521ad404cf566e0075848f80d064c0a0503 upstream.

Fix the kprobe tracer not to delete a probe while the probe is in use.
In that case the delete operation returns -EBUSY.
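
The shape of the guard (reconstructed from the description;
trace_probe_is_enabled() and tp are the helper and probe-object names
assumed here):

  /* An enabled trace event means the probe is in use: refuse removal. */
  if (trace_probe_is_enabled(tp))
          return -EBUSY;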

This bug can cause a kernel panic if enabled probes are deleted
during perf record.

(Add some probes on functions)
sh-4.2# perf probe --del probe:\*
sh-4.2# exit
(kernel panic)

This is originally reported on the fedora bugzilla:

 https://bugzilla.redhat.com/show_bug.cgi?id=742383

I've also checked that this problem doesn't happen with tracepoints
on module removal, because the perf event locks the target module.

$ sudo ./perf record -e xfs:\* -aR sh
sh-4.2# rmmod xfs
ERROR: Module xfs is in use
sh-4.2# exit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.203 MB perf.data (~8862 samples) ]

Signed-off-by: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Frank Ch. Eigler &lt;fche@redhat.com&gt;
Link: http://lkml.kernel.org/r/20111004104438.14591.6553.stgit@fedora15
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>time: Change jiffies_to_clock_t() argument type to unsigned long</title>
<updated>2011-11-11T17:43:10Z</updated>
<author>
<name>hank</name>
<email>pyu@redhat.com</email>
</author>
<published>2011-09-20T20:53:39Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=10bb7e9e824bc2d727a7b5a0813ded0461a293e9'/>
<id>urn:sha1:10bb7e9e824bc2d727a7b5a0813ded0461a293e9</id>
<content type='text'>
commit cbbc719fccdb8cbd87350a05c0d33167c9b79365 upstream.

The parameter's original type is long. On an i386 architecture it can
easily be larger than 0x80000000, which is negative when interpreted as
a signed long, so when this function converts it to u64 the value is
sign-extended, producing a bogus result.

Change the type to unsigned long so we get the correct result.
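
A standalone demonstration of the conversion (hypothetical test
program; int32_t/uint32_t model the 32-bit i386 'long'). It prints
18446744071562067968 for the signed case and 2147483648 for the
unsigned one:

  #include &lt;stdio.h&gt;
  #include &lt;stdint.h&gt;

  int main(void)
  {
	  int32_t  as_signed   = (int32_t)0x80000000;  /* old: long     */
	  uint32_t as_unsigned = 0x80000000u;          /* new: unsigned */

	  /* Widening the signed value to 64 bits sign-extends it. */
	  printf("signed:   %llu\n", (unsigned long long)(int64_t)as_signed);
	  printf("unsigned: %llu\n", (unsigned long long)(uint64_t)as_unsigned);
	  return 0;
  }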

Signed-off-by: hank &lt;pyu@redhat.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
[ build fix ]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>kmod: prevent kmod_loop_msg overflow in __request_module()</title>
<updated>2011-11-11T17:43:01Z</updated>
<author>
<name>Jiri Kosina</name>
<email>jkosina@suse.cz</email>
</author>
<published>2011-10-26T02:40:39Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=112e8e4019fdf3fc987cf0f839a2bfbd6c169e60'/>
<id>urn:sha1:112e8e4019fdf3fc987cf0f839a2bfbd6c169e60</id>
<content type='text'>
commit 37252db6aa576c34fd794a5a54fb32d7a8b3a07a upstream.

Due to the post-increment in the condition on kmod_loop_msg in
__request_module(), the system log can be spammed by far more than 5
instances of the 'runaway loop' message if the number of events
triggering it makes kmod_loop_msg overflow.

Fix that by making sure we never increment it past the threshold.
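
The shape of the change (reconstructed; the printk text is the one the
message above refers to):

  /* before: the counter keeps climbing past the cap and can wrap */
  if (kmod_loop_msg++ &lt; 5)
	  printk(KERN_ERR "request_module: runaway loop modprobe %s\n",
		 module_name);

  /* after: stop incrementing at the cap, so it can never overflow */
  if (kmod_loop_msg &lt; 5) {
	  printk(KERN_ERR "request_module: runaway loop modprobe %s\n",
		 module_name);
	  kmod_loop_msg++;
  }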

Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>cputimer: Cure lock inversion</title>
<updated>2011-10-18T09:36:59Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-10-17T09:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bcd5cff7216f9b2de0a148cc355eac199dc6f1cf'/>
<id>urn:sha1:bcd5cff7216f9b2de0a148cc355eac199dc6f1cf</id>
<content type='text'>
There's a lock inversion between the cputimer-&gt;lock and rq-&gt;lock;
notably the two callchains involved are:

 update_rlimit_cpu()
   sighand-&gt;siglock
   set_process_cpu_timer()
     cpu_timer_sample_group()
       thread_group_cputimer()
         cputimer-&gt;lock
         thread_group_cputime()
           task_sched_runtime()
             -&gt;pi_lock
             rq-&gt;lock

 scheduler_tick()
   rq-&gt;lock
   task_tick_fair()
     update_curr()
       account_group_exec_runtime()
         cputimer-&gt;lock

Here the first chain is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping it up to date.

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer-&gt;lock and rq-&gt;lock
nesting. This leaves concurrent enablers doing duplicate work, but the
time wasted should be of the same order as the time otherwise wasted
spinning on the lock, and the greater-than assignment filter should
ensure we preserve monotonicity.
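
In outline, the cure looks like this (a reconstruction;
update_gt_cputime() is the greater-than assignment filter referred to
above):

  /* thread_group_cputimer(), sketched: the expensive sample, which may
   * take pi_lock and rq-&gt;lock, now happens before cputimer-&gt;lock
   * is taken, so the two locks never nest. */
  if (!cputimer-&gt;running) {
	  thread_group_cputime(tsk, &amp;sum);
	  spin_lock_irqsave(&amp;cputimer-&gt;lock, flags);
	  cputimer-&gt;running = 1;
	  update_gt_cputime(&amp;cputimer-&gt;cputime, &amp;sum);
  } else
	  spin_lock_irqsave(&amp;cputimer-&gt;lock, flags);
  *times = cputimer-&gt;cputime;
  spin_unlock_irqrestore(&amp;cputimer-&gt;lock, flags);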

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Reported-by: Simon Kirby &lt;sim@hostway.ca&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: stable@kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>Avoid using variable-length arrays in kernel/sys.c</title>
<updated>2011-10-17T15:24:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-10-17T15:24:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a84a79e4d369a73c0130b5858199e949432da4c6'/>
<id>urn:sha1:a84a79e4d369a73c0130b5858199e949432da4c6</id>
<content type='text'>
The size is always valid, but variable-length arrays generate worse code
for no good reason (unless the function happens to be inlined and the
compiler sees the length for the simple constant it is).
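
A minimal illustration (hypothetical standalone C, not the kernel
function; NAME_MAX_LEN and use() are placeholders):

  #include &lt;string.h&gt;

  #define NAME_MAX_LEN 65
  void use(const char *buf, size_t len);

  /* VLA: the frame size depends on len, so the compiler emits run-time
   * stack arithmetic instead of constant offsets. */
  void fill_vla(size_t len)
  {
	  char buf[len];
	  memset(buf, 0, len);
	  use(buf, len);
  }

  /* Fixed bound: constant frame layout; the caller guarantees
   * len &lt;= NAME_MAX_LEN. */
  void fill_fixed(size_t len)
  {
	  char buf[NAME_MAX_LEN] = { 0 };
	  memset(buf, 0, len);
	  use(buf, len);
  }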

Also, there seems to be some code generation problem on POWER, where
Henrik Bakken reports that register r28 can get corrupted under some
subtle circumstances (interrupt happening at the wrong time?).  That all
indicates some seriously broken compiler issues, but since variable
length arrays are bad regardless, there's little point in trying to
chase it down.

"Just don't do that, then".

Reported-by: Henrik Grindal Bakken &lt;henribak@cisco.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branches 'irq-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip</title>
<updated>2011-10-01T15:37:25Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-10-01T15:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f72a209a3e694ecb8d3ceed4671d98c4364e00e3'/>
<id>urn:sha1:f72a209a3e694ecb8d3ceed4671d98c4364e00e3</id>
<content type='text'>
* 'irq-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  irq: Fix check for already initialized irq_domain in irq_domain_add
  irq: Add declaration of irq_domain_simple_ops to irqdomain.h

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86/rtc: Don't recursively acquire rtc_lock

* 'sched-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  posix-cpu-timers: Cure SMP wobbles
  sched: Fix up wchan borkage
  sched/rt: Migrate equal priority tasks to available CPUs
</content>
</entry>
<entry>
<title>posix-cpu-timers: Cure SMP wobbles</title>
<updated>2011-09-30T12:07:06Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-09-01T10:42:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d670ec13178d0fd8680e6742a2bc6e04f28f87d8'/>
<id>urn:sha1:d670ec13178d0fd8680e6742a2bc6e04f28f87d8</id>
<content type='text'>
David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$ 

  The diff of 'process' should always be &gt;= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include &lt;unistd.h&gt;
  #include &lt;stdio.h&gt;
  #include &lt;stdlib.h&gt;
  #include &lt;time.h&gt;
  #include &lt;fcntl.h&gt;
  #include &lt;string.h&gt;
  #include &lt;errno.h&gt;
  #include &lt;pthread.h&gt;

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
	  pthread_barrier_wait(&amp;barrier);
	  while (1)
		  __asm__ __volatile__("" : : : "memory");
	  return NULL;
  }

  int main(void)
  {
	  clockid_t process_clock, my_thread_clock, th_clock;
	  struct timespec process_before, process_after;
	  struct timespec me_before, me_after;
	  struct timespec th_before, th_after;
	  struct timespec sleeptime;
	  unsigned long diff;
	  pthread_t th;
	  int err;

	  err = clock_getcpuclockid(0, &amp;process_clock);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(pthread_self(), &amp;my_thread_clock);
	  if (err)
		  return 1;

	  pthread_barrier_init(&amp;barrier, NULL, 2);
	  err = pthread_create(&amp;th, NULL, chew_cpu, NULL);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(th, &amp;th_clock);
	  if (err)
		  return 1;

	  pthread_barrier_wait(&amp;barrier);

	  err = clock_gettime(process_clock, &amp;process_before);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &amp;me_before);
	  if (err)
		  return 1;

	  err = clock_gettime(th_clock, &amp;th_before);
	  if (err)
		  return 1;

	  sleeptime.tv_sec = 0;
	  sleeptime.tv_nsec = 500000000;
	  nanosleep(&amp;sleeptime, NULL);

	  err = clock_gettime(th_clock, &amp;th_after);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &amp;me_after);
	  if (err)
		  return 1;

	  err = clock_gettime(process_clock, &amp;process_after);
	  if (err)
		  return 1;

	  diff = process_after.tv_nsec - process_before.tv_nsec;
	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 process_before.tv_sec, process_before.tv_nsec,
		 process_after.tv_sec, process_after.tv_nsec, diff);
	  diff = th_after.tv_nsec - th_before.tv_nsec;
	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 th_before.tv_sec, th_before.tv_nsec,
		 th_after.tv_sec, th_after.tv_nsec, diff);
	  diff = me_after.tv_nsec - me_before.tv_nsec;
	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 me_before.tv_sec, me_before.tv_nsec,
		 me_after.tv_sec, me_after.tv_nsec, diff);

	  return 0;
  }

This is due to our use of p-&gt;se.sum_exec_runtime in
thread_group_cputime(), where we iterate over the thread group and sum
the per-thread data. That sum does not take the time accrued since the
last schedule operation (tick or otherwise) into account. We can cure
this by using task_sched_runtime(), at the cost of having to take
locks.

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().

Aside from that, it makes the function safe on 32-bit systems: the old
code added t-&gt;se.sum_exec_runtime unprotected, and sum_exec_runtime
is a 64-bit value that could be changed on another CPU at the same
time, so a 32-bit reader could see a torn value.
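
The core of the change, reconstructed from the description above (the
summation loop in thread_group_cputime(), with 3.1-era helpers):

  t = tsk;
  do {
	  times-&gt;utime = cputime_add(times-&gt;utime, t-&gt;utime);
	  times-&gt;stime = cputime_add(times-&gt;stime, t-&gt;stime);
	  /* was: times-&gt;sum_exec_runtime += t-&gt;se.sum_exec_runtime; */
	  times-&gt;sum_exec_runtime += task_sched_runtime(t);
  } while_each_thread(tsk, t);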

Reported-by: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
</feed>
