<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/core.c, branch v3.4.19</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/sched/core.c?h=v3.4.19</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/sched/core.c?h=v3.4.19'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-10-12T20:38:57Z</updated>
<entry>
<title>CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume</title>
<updated>2012-10-12T20:38:57Z</updated>
<author>
<name>Srivatsa S. Bhat</name>
<email>srivatsa.bhat@linux.vnet.ibm.com</email>
</author>
<published>2012-05-24T14:16:26Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c62f9945efea31db203fd4fb77e830ddffdcabf6'/>
<id>urn:sha1:c62f9945efea31db203fd4fb77e830ddffdcabf6</id>
<content type='text'>
commit d35be8bab9b0ce44bed4b9453f86ebf64062721e upstream.

In the event of CPU hotplug, the kernel modifies the cpusets' cpus_allowed
masks as and when necessary to ensure that the tasks belonging to the cpusets
have some place (online CPUs) to run on. And regular CPU hotplug is
destructive in the sense that the kernel doesn't remember the original cpuset
configurations set by the user, across hotplug operations.

However, suspend/resume (which uses CPU hotplug) is a special case in which
the kernel has the responsibility to restore the system (during resume), to
exactly the same state it was in before suspend.

In order to achieve that, do the following:

1. Don't modify cpusets during suspend/resume. At all.
   In particular, don't move the tasks from one cpuset to another, and
   don't modify any cpuset's cpus_allowed mask. So, simply ignore cpusets
   during the CPU hotplug operations that are carried out in the
   suspend/resume path.

2. However, cpusets and sched domains are related. We just want to avoid
   altering cpusets alone. So, to keep the sched domains updated, build
   a single sched domain (containing all active cpus) during each of the
   CPU hotplug operations carried out in s/r path, effectively ignoring
   the cpusets' cpus_allowed masks.

   (Since userspace is frozen while doing all this, it will go unnoticed.)

3. During the last CPU online operation during resume, build the sched
   domains by looking up the (unaltered) cpusets' cpus_allowed masks.
   That will bring back the system to the same original state as it was in
   before suspend.

Ultimately, this will not only solve the cpuset problem related to suspend
resume (ie., restores the cpusets to exactly what it was before suspend, by
not touching it at all) but also speeds up suspend/resume because we avoid
running cpuset update code for every CPU being offlined/onlined.

Signed-off-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20120524141611.3692.20155.stgit@srivatsabhat.in.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Preeti U Murthy &lt;preeti@linux.vnet.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix race in task_group()</title>
<updated>2012-10-02T17:30:35Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2012-06-22T11:36:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e194fab8d7ebe95db5609f1cb6794c2afcc3118f'/>
<id>urn:sha1:e194fab8d7ebe95db5609f1cb6794c2afcc3118f</id>
<content type='text'>
commit 8323f26ce3425460769605a6aece7a174edaa7d1 upstream.

Stefan reported a crash on a kernel before a3e5d1091c1 ("sched:
Don't call task_group() too many times in set_task_rq()"), he
found the reason to be that the multiple task_group()
invocations in set_task_rq() returned different values.

Looking at all that I found a lack of serialization and plain
wrong comments.

The below tries to fix it using an extra pointer which is
updated under the appropriate scheduler locks. Its not pretty,
but I can't really see another way given how all the cgroup
stuff works.

Reported-and-tested-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1340364965.18025.71.camel@twins
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: fix divide by zero at {thread_group,task}_times</title>
<updated>2012-09-14T17:00:20Z</updated>
<author>
<name>Stanislaw Gruszka</name>
<email>sgruszka@redhat.com</email>
</author>
<published>2012-08-08T09:27:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ebad30a797a5e5093d69e39bc046edab9452fe00'/>
<id>urn:sha1:ebad30a797a5e5093d69e39bc046edab9452fe00</id>
<content type='text'>
commit bea6832cc8c4a0a9a65dd17da6aaa657fe27bc3e upstream.

On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Signed-off-by: Stanislaw Gruszka &lt;sgruszka@redhat.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched,cgroup: Fix up task_groups list</title>
<updated>2012-09-14T17:00:20Z</updated>
<author>
<name>Mike Galbraith</name>
<email>efault@gmx.de</email>
</author>
<published>2012-08-07T03:00:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0f342b96d471d48fb151ae9073e58e0510e0a58f'/>
<id>urn:sha1:0f342b96d471d48fb151ae9073e58e0510e0a58f</id>
<content type='text'>
commit 35cf4e50b16331def6cfcbee11e49270b6db07f5 upstream.

With multiple instances of task_groups, for_each_rt_rq() is a noop,
no task groups having been added to the rt.c list instance.  This
renders __enable/disable_runtime() and print_rt_stats() noop, the
user (non) visible effect being that rt task groups are missing in
/proc/sched_debug.

Signed-off-by: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1344308413.6846.7.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched/nohz: Rewrite and fix load-avg computation -- again</title>
<updated>2012-07-19T15:58:56Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-06-22T13:52:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7490d0a4cfefa16f9d8ce636eb5b2e13d2432db3'/>
<id>urn:sha1:7490d0a4cfefa16f9d8ce636eb5b2e13d2432db3</id>
<content type='text'>
commit 5167e8d5417bf5c322a703d2927daec727ea40dd upstream.

Thanks to Charles Wang for spotting the defects in the current code:

 - If we go idle during the sample window -- after sampling, we get a
   negative bias because we can negate our own sample.

 - If we wake up during the sample window we get a positive bias
   because we push the sample to a known active period.

So rewrite the entire nohz load-avg muck once again, now adding
copious documentation to the code.

Reported-and-tested-by: Doug Smythies &lt;dsmythies@telus.net&gt;
Reported-and-tested-by: Charles Wang &lt;muming.wq@gmail.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1340373782.18025.74.camel@twins
[ minor edits ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix the relax_domain_level boot parameter</title>
<updated>2012-06-17T18:21:28Z</updated>
<author>
<name>Dimitri Sivanich</name>
<email>sivanich@sgi.com</email>
</author>
<published>2012-06-05T18:44:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8997b2223b9d81f6085764d08f9de3a2da760333'/>
<id>urn:sha1:8997b2223b9d81f6085764d08f9de3a2da760333</id>
<content type='text'>
commit a841f8cef4bb124f0f5563314d0beaf2e1249d72 upstream.

It does not get processed because sched_domain_level_max is 0 at the
time that setup_relax_domain_level() is run.

Simply accept the value as it is, as we don't know the value of
sched_domain_level_max until sched domain construction is completed.

Fix sched_relax_domain_level in cpuset.  The build_sched_domain() routine calls
the set_domain_attribute() routine prior to setting the sd-&gt;level, however,
the set_domain_attribute() routine relies on the sd-&gt;level to decide whether
idle load balancing will be off/on.

Signed-off-by: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/20120605184436.GA15668@sgi.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption</title>
<updated>2012-05-09T10:27:35Z</updated>
<author>
<name>Igor Mammedov</name>
<email>imammedo@redhat.com</email>
</author>
<published>2012-05-09T10:38:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=30b4e9eb783d94e9f5d503b15eb31720679ae1c7'/>
<id>urn:sha1:30b4e9eb783d94e9f5d503b15eb31720679ae1c7</id>
<content type='text'>
If we have one cpu that failed to boot and boot cpu gave up on
waiting for it and then another cpu is being booted, kernel
might crash with following OOPS:

   BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
   IP: [&lt;ffffffff812c3630&gt;] __bitmap_weight+0x30/0x80
   Call Trace:
       [&lt;ffffffff8108b9b6&gt;] build_sched_domains+0x7b6/0xa50

The crash happens in init_sched_groups_power() that expects
sched_groups to be circular linked list. However it is not
always true, since sched_groups preallocated in __sdt_alloc are
initialized in build_sched_groups and it may exit early

        if (cpu != cpumask_first(sched_domain_span(sd)))
                return 0;

without initializing sd-&gt;groups-&gt;next field.

Fix bug by initializing next field right after sched_group was
allocated.

Also-Reported-by: Jiang Liu &lt;liuj97@gmail.com&gt;
Signed-off-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: a.p.zijlstra@chello.nl
Cc: pjt@google.com
Cc: seto.hidetoshi@jp.fujitsu.com
Link: http://lkml.kernel.org/r/1336559908-32533-1-git-send-email-imammedo@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Fix OOPS when build_sched_domains() percpu allocation fails</title>
<updated>2012-04-26T10:54:53Z</updated>
<author>
<name>he, bo</name>
<email>bo.he@intel.com</email>
</author>
<published>2012-04-25T11:59:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fb2cf2c660971bea0ad86a9a5c19ad39eab61344'/>
<id>urn:sha1:fb2cf2c660971bea0ad86a9a5c19ad39eab61344</id>
<content type='text'>
Under extreme memory used up situations, percpu allocation
might fail. We hit it when system goes to suspend-to-ram,
causing a kworker panic:

 EIP: [&lt;c124411a&gt;] build_sched_domains+0x23a/0xad0
 Kernel panic - not syncing: Fatal exception
 Pid: 3026, comm: kworker/u:3
 3.0.8-137473-gf42fbef #1

 Call Trace:
  [&lt;c18cc4f2&gt;] panic+0x66/0x16c
  [...]
  [&lt;c1244c37&gt;] partition_sched_domains+0x287/0x4b0
  [&lt;c12a77be&gt;] cpuset_update_active_cpus+0x1fe/0x210
  [&lt;c123712d&gt;] cpuset_cpu_inactive+0x1d/0x30
  [...]

With this fix applied build_sched_domains() will return -ENOMEM and
the suspend attempt fails.

Signed-off-by: he, bo &lt;bo.he@intel.com&gt;
Reviewed-by: Zhang, Yanmin &lt;yanmin.zhang@intel.com&gt;
Reviewed-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: &lt;stable@kernel.org&gt;
Link: http://lkml.kernel.org/r/1335355161.5892.17.camel@hebo
[ So, we fail to deallocate a CPU because we cannot allocate RAM :-/
  I don't like that kind of sad behavior but nevertheless it should
  not crash under high memory load. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-31T20:35:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-31T20:35:31Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f22e08a79f3765fecf060b225a46931c94fb0a92'/>
<id>urn:sha1:f22e08a79f3765fecf060b225a46931c94fb0a92</id>
<content type='text'>
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix incorrect usage of for_each_cpu_mask() in select_fallback_rq()
  sched: Fix __schedule_bug() output when called from an interrupt
  sched/arch: Introduce the finish_arch_post_lock_switch() scheduler callback
</content>
</entry>
<entry>
<title>sched: Fix incorrect usage of for_each_cpu_mask() in select_fallback_rq()</title>
<updated>2012-03-31T08:43:36Z</updated>
<author>
<name>Srivatsa S. Bhat</name>
<email>srivatsa.bhat@linux.vnet.ibm.com</email>
</author>
<published>2012-03-30T14:10:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e3831edd59edf57ca11fc289f08961b20baf5146'/>
<id>urn:sha1:e3831edd59edf57ca11fc289f08961b20baf5146</id>
<content type='text'>
The function for_each_cpu_mask() expects a *pointer* to struct
cpumask as its second argument, whereas select_fallback_rq()
passes the value itself.

And moreover, for_each_cpu_mask() has been marked as obselete
in include/linux/cpumask.h. So move to the more appropriate
for_each_cpu() variant.

Reported-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Signed-off-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Liu Chuansheng &lt;chuansheng.liu@intel.com&gt;
Cc: vapier@gentoo.org
Cc: rusty@rustcorp.com.au
Link: http://lkml.kernel.org/r/4F75BED4.9050005@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
