<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/rt.c, branch v3.12.14</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/sched/rt.c?h=v3.12.14</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/sched/rt.c?h=v3.12.14'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2014-01-09T20:25:10Z</updated>
<entry>
<title>sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities</title>
<updated>2014-01-09T20:25:10Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-11-27T15:59:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0bff3e480b75e408444ad5f0d8bd15a407f4a6cb'/>
<id>urn:sha1:0bff3e480b75e408444ad5f0d8bd15a407f4a6cb</id>
<content type='text'>
commit 757dfcaa41844595964f1220f1d33182dae49976 upstream.

This patch touches the RT group scheduling case.

Functions inc_rt_prio_smp() and dec_rt_prio_smp() change (global) rq's
priority, while rt_rq passed to them may be not the top-level rt_rq.
This is wrong, because changing of priority on a child level does not
guarantee that the priority is the highest all over the rq. So, this
leak makes RT balancing unusable.

The short example: the task having the highest priority among all rq's
RT tasks (no one other task has the same priority) are waking on a
throttle rt_rq.  The rq's cpupri is set to the task's priority
equivalent, but real rq-&gt;rt.highest_prio.curr is less.

The patch below fixes the problem.

Signed-off-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/49231385567953@web4m.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched/rt: Simplify pull_rt_task() logic and remove .leaf_rt_rq_list</title>
<updated>2013-06-19T10:58:40Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-06-07T19:37:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e23ee74777f389369431d77390c4b09332ce026a'/>
<id>urn:sha1:e23ee74777f389369431d77390c4b09332ce026a</id>
<content type='text'>
[ Peter, this is based off of some of my work, I ran it though a few
  tests and it passed. I also reviewed it, and added my SOB as I am
  somewhat a co-author to it. ]

Based on the patch by Steven Rostedt from previous year:

https://lkml.org/lkml/2012/4/18/517

1)Simplify pull_rt_task() logic: search in pushable tasks of dest runqueue.
The only pullable tasks are the tasks which are pushable in their local rq,
and no others.

2)Remove .leaf_rt_rq_list member of struct rt_rq and functions connected
with it: nobody uses it since now.

Signed-off-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/287571370557898@web7d.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Use an accessor to read the rq clock</title>
<updated>2013-05-28T07:40:27Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2013-04-11T23:51:02Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=78becc27097585c6aec7043834cadde950ae79f2'/>
<id>urn:sha1:78becc27097585c6aec7043834cadde950ae79f2</id>
<content type='text'>
Read the runqueue clock through an accessor. This
prepares for adding a debugging infrastructure to
detect missing or redundant calls to update_rq_clock()
between a scheduler's entry and exit point.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Li Zhong &lt;zhong@linux.vnet.ibm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1365724262-20142-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Remove redundant update_runtime notifier</title>
<updated>2013-05-28T07:40:22Z</updated>
<author>
<name>Neil Zhang</name>
<email>zhangwm@marvell.com</email>
</author>
<published>2013-04-11T13:04:59Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c5405a495e88d93cf9b4f4cc91507c7f4afcb901'/>
<id>urn:sha1:c5405a495e88d93cf9b4f4cc91507c7f4afcb901</id>
<content type='text'>
migration_call() will do all the things that update_runtime() does.
So let's remove it.

Furthermore, there is potential risk that the current code will catch
BUG_ON at line 689 of rt.c when do cpu hotplug while there are realtime
threads running because of enabling runtime twice while the rt_runtime
may already changed.

Signed-off-by: Neil Zhang &lt;zhangwm@marvell.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1365685499-26515-1-git-send-email-zhangwm@marvell.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Use this_rq() helper</title>
<updated>2013-05-10T08:35:56Z</updated>
<author>
<name>Nathan Zimmer</name>
<email>nzimmer@sgi.com</email>
</author>
<published>2013-05-09T16:24:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=424c93fe4cbe719e7fd7169248d2b648c493b68d'/>
<id>urn:sha1:424c93fe4cbe719e7fd7169248d2b648c493b68d</id>
<content type='text'>
It is a few instructions more efficent to and slightly more
readable to use this_rq()-&gt; instead of cpu_rq(smp_processor_id())-&gt; .

Size comparison of kernel/sched/fair.o:

   text    data     bss     dec     hex filename
  27972     122      26   28120    6dd8 fair.o.before
  27956     122      26   28104    6dc8 fair.o.after

Signed-off-by: Nathan Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1368116643-87971-1-git-send-email-nzimmer@sgi.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2013-02-20T02:19:48Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-20T02:19:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d652e1eb8e7b739fccbfb503a3da3e9f640fbf3d'/>
<id>urn:sha1:d652e1eb8e7b739fccbfb503a3da3e9f640fbf3d</id>
<content type='text'>
Pull scheduler changes from Ingo Molnar:
 "Main changes:

   - scheduler side full-dynticks (user-space execution is undisturbed
     and receives no timer IRQs) preparation changes that convert the
     cputime accounting code to be full-dynticks ready, from Frederic
     Weisbecker.

   - Initial sched.h split-up changes, by Clark Williams

   - select_idle_sibling() performance improvement by Mike Galbraith:

        " 1 tbench pair (worst case) in a 10 core + SMT package:

          pre   15.22 MB/sec 1 procs
          post 252.01 MB/sec 1 procs "

  - sched_rr_get_interval() ABI fix/change.  We think this detail is not
    used by apps (so it's not an ABI in practice), but lets keep it
    under observation.

  - misc RT scheduling cleanups, optimizations"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  sched/rt: Add &lt;linux/sched/rt.h&gt; header to &lt;linux/init_task.h&gt;
  cputime: Remove irqsave from seqlock readers
  sched, powerpc: Fix sched.h split-up build failure
  cputime: Restore CPU_ACCOUNTING config defaults for PPC64
  sched/rt: Move rt specific bits into new header file
  sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
  sched: Move sched.h sysctl bits into separate header
  sched: Fix signedness bug in yield_to()
  sched: Fix select_idle_sibling() bouncing cow syndrome
  sched/rt: Further simplify pick_rt_task()
  sched/rt: Do not account zero delta_exec in update_curr_rt()
  cputime: Safely read cputime of full dynticks CPUs
  kvm: Prepare to add generic guest entry/exit callbacks
  cputime: Use accessors to read task cputime stats
  cputime: Allow dynamic switch between tick/virtual based cputime accounting
  cputime: Generic on-demand virtual cputime accounting
  cputime: Move default nsecs_to_cputime() to jiffies based cputime file
  cputime: Librarize per nsecs resolution cputime definitions
  cputime: Avoid multiplication overflow on utime scaling
  context_tracking: Export context state for generic vtime
  ...

Fix up conflict in kernel/context_tracking.c due to comment additions.
</content>
</entry>
<entry>
<title>sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice</title>
<updated>2013-02-07T19:51:07Z</updated>
<author>
<name>Clark Williams</name>
<email>williams@redhat.com</email>
</author>
<published>2013-02-07T15:47:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ce0dbbbb30aee6a835511d5be446462388ba9eee'/>
<id>urn:sha1:ce0dbbbb30aee6a835511d5be446462388ba9eee</id>
<content type='text'>
Add a /proc/sys/kernel scheduler knob named
sched_rr_timeslice_ms that allows global changing of the
SCHED_RR timeslice value. User visable value is in milliseconds
but is stored as jiffies.  Setting to 0 (zero) resets to the
default (currently 100ms).

Signed-off-by: Clark Williams &lt;williams@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/20130207094704.13751796@riff.lan
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Further simplify pick_rt_task()</title>
<updated>2013-02-03T18:54:58Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-01-31T14:56:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=60334caf37dc7c59120b21faa625534a6fffead0'/>
<id>urn:sha1:60334caf37dc7c59120b21faa625534a6fffead0</id>
<content type='text'>
Function next_prio() has been removed and pull_rt_task() is the
only user of pick_next_highest_task_rt() at the moment.

pull_rt_task is not interested in p-&gt;nr_cpus_allowed, its only
interest is the fact that cpu is allowed to execute p. If
nr_cpus_allowed == 1, cpu != task_cpu(p) and cpu is allowed then
it means that task p is in the middle of the migration
techniques; the task waits until it is moved by migration
thread. So, lets pull it earlier.

Signed-off-by: Kirill V Tkhai &lt;tkhai@yandex.ru&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: linux-rt-users &lt;linux-rt-users@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/70871359644177@web16d.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Do not account zero delta_exec in update_curr_rt()</title>
<updated>2013-01-31T09:31:13Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-01-30T12:50:36Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fc79e240be5aa379dd36a62158be5a5ee0e4aec7'/>
<id>urn:sha1:fc79e240be5aa379dd36a62158be5a5ee0e4aec7</id>
<content type='text'>
There are several places of consecutive calls of
dequeue_task_rt() and put_prev_task_rt() in the scheduler.
For example, function rt_mutex_setprio() does it.

The both calls lead to update_curr_rt(), the second of it
receives zeroed delta_exec. The only effective action in this
case is call of sched_rt_avg_update(), which can change
rq-&gt;age_stamp and rq-&gt;rt_avg. But it is possible in case of
""floating"" rq-&gt;clock. This fact is not reasonable to be
accounted. Another actions do nothing.

Signed-off-by: Kirill V Tkhai &lt;tkhai@yandex.ru&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: linux-rt-users &lt;linux-rt-users@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/931541359550236@web1g.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Avoid updating RT entry timeout twice within one tick period</title>
<updated>2013-01-25T07:31:54Z</updated>
<author>
<name>Ying Xue</name>
<email>ying.xue@windriver.com</email>
</author>
<published>2012-07-17T07:03:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=57d2aa00dcec67afa52478730f2b524521af14fb'/>
<id>urn:sha1:57d2aa00dcec67afa52478730f2b524521af14fb</id>
<content type='text'>
The issue below was found in 2.6.34-rt rather than mainline rt
kernel, but the issue still exists upstream as well.

So please let me describe how it was noticed on 2.6.34-rt:

On this version, each softirq has its own thread, it means there
is at least one RT FIFO task per cpu. The priority of these
tasks is set to 49 by default. If user launches an RT FIFO task
with priority lower than 49 of softirq RT tasks, it's possible
there are two RT FIFO tasks enqueued one cpu runqueue at one
moment. By current strategy of balancing RT tasks, when it comes
to RT tasks, we really need to put them off to a CPU that they
can run on as soon as possible. Even if it means a bit of cache
line flushing, we want RT tasks to be run with the least latency.

When the user RT FIFO task which just launched before is
running, the sched timer tick of the current cpu happens. In this
tick period, the timeout value of the user RT task will be
updated once. Subsequently, we try to wake up one softirq RT
task on its local cpu. As the priority of current user RT task
is lower than the softirq RT task, the current task will be
preempted by the higher priority softirq RT task. Before
preemption, we check to see if current can readily move to a
different cpu. If so, we will reschedule to allow the RT push logic
to try to move current somewhere else. Whenever the woken
softirq RT task runs, it first tries to migrate the user FIFO RT
task over to a cpu that is running a task of lesser priority. If
migration is done, it will send a reschedule request to the found
cpu by IPI interrupt. Once the target cpu responds the IPI
interrupt, it will pick the migrated user RT task to preempt its
current task. When the user RT task is running on the new cpu,
the sched timer tick of the cpu fires. So it will tick the user
RT task again. This also means the RT task timeout value will be
updated again. As the migration may be done in one tick period,
it means the user RT task timeout value will be updated twice
within one tick.

If we set a limit on the amount of cpu time for the user RT task
by setrlimit(RLIMIT_RTTIME), the SIGXCPU signal should be posted
upon reaching the soft limit.

But exactly when the SIGXCPU signal should be sent depends on the
RT task timeout value. In fact the timeout mechanism of sending
the SIGXCPU signal assumes the RT task timeout is increased once
every tick.

However, currently the timeout value may be added twice per
tick. So it results in the SIGXCPU signal being sent earlier
than expected.

To solve this issue, we prevent the timeout value from increasing
twice within one tick time by remembering the jiffies value of
last updating the timeout. As long as the RT task's jiffies is
different with the global jiffies value, we allow its timeout to
be updated.

Signed-off-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Fan Du &lt;fan.du@windriver.com&gt;
Reviewed-by: Yong Zhang &lt;yong.zhang0@gmail.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1342508623-2887-1-git-send-email-ying.xue@windriver.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
