path: root/kernel/sched_debug.c
Age | Commit message | Author
2008-11-10 | sched: clean up debug info | Peter Zijlstra
Impact: clean up and fix debug info printout. While looking over the sched_debug code I noticed that we printed the rq schedstats for every cfs_rq; amend this. Also change nr_spread_over into an int, and fix a little buglet in min_vruntime printing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 | sched: change sched_debug's mode to 0444 | Li Zefan
Impact: change /proc/sched_debug from rw-r--r-- to r--r--r--. /proc/sched_debug is read-only. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | [PATCH] signal, procfs: some lock_task_sighand() users do not need rcu_read_lock() | Lai Jiangshan
lock_task_sighand() makes sure task->sighand is protected, so we do not need rcu_read_lock(). [ exec() will get task->sighand->siglock before changing task->sighand! ] Code using rcu_read_lock() _just_ to protect lock_task_sighand() only appears in procfs (and some code in procfs uses lock_task_sighand() without such redundant protection). Other subsystems may put lock_task_sighand() into an rcu_read_lock() critical region, but these rcu_read_lock() calls are used for protecting "for_each_process()", "find_task_by_vpid()" etc., not for protecting lock_task_sighand(). Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> [ok from Oleg] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2008-06-27 | sched: add full schedstats to /proc/sched_debug | Peter Zijlstra
Show all the schedstats in /proc/sched_debug as well. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-27 | sched: revert revert of: fair-group: SMP-nice for group scheduling | Peter Zijlstra
Try again.. Initial commit: 18d95a2832c1392a2d63227a7a6d433cb9f2037e Revert: 6363ca57c76b7b83639ca8c83fc285fa26a7880e Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 | sched: debug: add some rt debug output | Peter Zijlstra
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "Daniel K." <dk@uw.no> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-29 | revert ("sched: fair-group: SMP-nice for group scheduling") | Ingo Molnar
Yanmin Zhang reported:

  Comparing with 2.6.25, volanoMark has big regression with kernel 2.6.26-rc1. It's about 50% on my 8-core stoakley, 16-core tigerton, and Itanium Montecito. With bisect, I located the following patch:

  | 18d95a2832c1392a2d63227a7a6d433cb9f2037e is first bad commit
  | commit 18d95a2832c1392a2d63227a7a6d433cb9f2037e
  | Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
  | Date: Sat Apr 19 19:45:00 2008 +0200
  |
  |     sched: fair-group: SMP-nice for group scheduling

Revert it so that we get v2.6.25 behavior.
Bisected-by: Yanmin Zhang <yanmin_zhang@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-05 | sched: add optional support for CONFIG_HAVE_UNSTABLE_SCHED_CLOCK | Peter Zijlstra
this replaces the rq->clock stuff (and possibly cpu_clock()).
 - architectures that have an 'imperfect' hardware clock can set CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
 - the 'jiffie' window might be superfluous when we update tick_gtod before the __update_sched_clock() call in sched_clock_tick()
 - cpu_clock() might be implemented as: sched_clock_cpu(smp_processor_id()) if the accuracy proves good enough
 - how far can TSC drift in a single jiffie when considering the filtering and idle hooks?
[ mingo@elte.hu: various fixes and cleanups ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-01 | rename div64_64 to div64_u64 | Roman Zippel
Rename div64_64 to div64_u64 to make it consistent with the other divide functions, so it clearly includes the type of the divide. Move its definition to math64.h as currently no architecture overrides the generic implementation. They can still override it of course, but the duplicated declarations are avoided. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> Cc: Avi Kivity <avi@qumranet.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29 | kernel: use non-racy method for proc entries creation | Denis V. Lunev
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data are set up before gluing the PDE to the main tree. Signed-off-by: Denis V. Lunev <den@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-19 | sched: build fix | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 | sched: debug: add some debug code to handle the full hierarchy | Peter Zijlstra
Add some extra debug output so we can get a better overview of the full hierarchy. We print the cgroup path after each cfs_rq, so we can see what group we're looking at. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 | sched: remove sysctl_sched_batch_wakeup_granularity | Ingo Molnar
it's unused. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-19 | sched: improve affine wakeups | Ingo Molnar
improve affine wakeups. Maintain the 'overlap' metric based on CFS's sum_exec_runtime - which means the amount of time a task executes after it wakes up some other task. Use the 'overlap' for the wakeup decisions: if the 'overlap' is short, it means there's strong workload coupling between this task and the woken up task. If the 'overlap' is large then the workload is decoupled and the scheduler will move them to separate CPUs more easily. ( Also slightly move the preempt_check within try_to_wake_up() - this has no effect on functionality but allows 'early wakeups' (for still-on-rq tasks) to be correctly accounted as well.) Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 | sched: keep total / count stats in addition to the max for | Arjan van de Ven
Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
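As a rough illustration of the bookkeeping described in this commit message, the C sketch below keeps a running sum and count next to the maximum, so that an average can be derived on demand. The names wait_stats, wait_stats_record and wait_stats_avg are hypothetical and are not the fields or helpers used in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task wait statistics (illustrative names only). */
struct wait_stats {
    uint64_t max;   /* longest single wait seen so far, in ns */
    uint64_t sum;   /* total time spent waiting, in ns */
    uint64_t count; /* number of waits recorded */
};

/* Record one wait of 'delta' ns: update max, total and count together. */
static void wait_stats_record(struct wait_stats *s, uint64_t delta)
{
    assert(s != NULL);
    if (delta > s->max)
        s->max = delta;
    s->sum += delta;
    s->count++;
}

/* Average wait in ns; returns 0 when nothing has been recorded yet. */
static uint64_t wait_stats_avg(const struct wait_stats *s)
{
    assert(s != NULL);
    return s->count ? s->sum / s->count : 0;
}
```

Keeping the sum and count rather than a precomputed average means the hot path only does an add and an increment; the division happens when a reader asks for the figure.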
2008-01-25 | sched: monitor clock underflows in /proc/sched_debug | Guillaume Chazarain
We monitor clock overflows, let's also monitor clock underflows. Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-12-30 | sched: fix gcc warnings | Ingo Molnar
Meelis Roos reported these warnings on sparc64:

  CC kernel/sched.o
  In file included from kernel/sched.c:879:
  kernel/sched_debug.c: In function 'nsec_high':
  kernel/sched_debug.c:38: warning: comparison of distinct pointer types lacks a cast

the debug check in do_div() is over-eager here, because the long long is always positive in these places. Mark this by casting them to unsigned long long.

no change in code output:

   text    data     bss     dec     hex filename
  51471    6582     376   58429    e43d sched.o.before
  51471    6582     376   58429    e43d sched.o.after

md5:
  7f7729c111f185bf3ccea4d542abc049 sched.o.before.asm
  7f7729c111f185bf3ccea4d542abc049 sched.o.after.asm

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-28 | sched: clean up overlong line in kernel/sched_debug.c | Ingo Molnar
clean up overlong line in kernel/sched_debug.c. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-26 | sched: bump version of kernel/sched_debug.c | Ingo Molnar
bump version of kernel/sched_debug.c and remove CFS version information from it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-11-09 | sched: reintroduce the sched_min_granularity tunable | Peter Zijlstra
we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. no functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
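The idea of exposing min_granularity while keeping the ratio internal can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the kernel's code: the struct sched_tunables, the setter and the sched_period() helper are hypothetical names, and the validation rule (reject zero or a value larger than the latency) is an assumption made so that the derived ratio can never be 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables; the ratio is derived, never set directly. */
struct sched_tunables {
    uint64_t latency_ns;         /* targeted scheduling latency */
    uint64_t min_granularity_ns; /* user-visible minimum slice */
    uint64_t nr_latency;         /* internal: latency / min_granularity */
};

/*
 * Set the user-visible tunable and recompute the internal ratio.
 * Returns -1 (and changes nothing) for values that would make the
 * ratio zero or the division undefined.
 */
static int set_min_granularity(struct sched_tunables *t, uint64_t min_gran_ns)
{
    if (min_gran_ns == 0 || min_gran_ns > t->latency_ns)
        return -1;
    t->min_granularity_ns = min_gran_ns;
    t->nr_latency = t->latency_ns / min_gran_ns;
    return 0;
}

/*
 * Scheduling period: the latency target while few tasks are runnable,
 * stretched to one minimum slice per task once there are more tasks
 * than the ratio allows.
 */
static uint64_t sched_period(const struct sched_tunables *t, uint64_t nr_running)
{
    assert(t->nr_latency > 0);
    if (nr_running > t->nr_latency)
        return t->min_granularity_ns * nr_running;
    return t->latency_ns;
}
```

With a 20 ms latency and a 4 ms minimum granularity the derived ratio is 5, so three runnable tasks share a 20 ms period while ten tasks stretch it to 40 ms.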
2007-10-25 | sched: fix unconditional irq lock | Peter Zijlstra
Lockdep noticed that this lock can also be taken from hardirq context, and can thus not unconditionally disable/enable irqs.

WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
 [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
 [show_trace+18/32] show_trace+0x12/0x20
 [dump_stack+22/32] dump_stack+0x16/0x20
 [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
 [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
 [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
 [show_state_filter+326/368] show_state_filter+0x146/0x170
 [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
 [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
 [handle_sysrq+40/64] handle_sysrq+0x28/0x40
 [kbd_event+1045/1680] kbd_event+0x415/0x690
 [input_pass_event+206/208] input_pass_event+0xce/0xd0
 [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
 [input_event+95/112] input_event+0x5f/0x70
 [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
 [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
 [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
 [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
 [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
 [do_IRQ+64/128] do_IRQ+0x40/0x80
 [common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-18 | sched: reduce schedstat variable overhead a bit | Ken Chen
schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in a production system is quite high, largely due to the number of events it collects and also due to its large memory footprint. Most of the fields probably don't need to be full 64-bit on 64-bit arch. Rolling over 4 billion events will most likely take a long time and user space tools can be made to accommodate that. I'm proposing that the kernel cut back most of the variable widths on 64-bit systems. (Note: the following patch doesn't affect 32-bit systems.) Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
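One way a user space tool could accommodate a 32-bit counter that rolls over is to compute deltas with unsigned modular arithmetic. The C sketch below shows that technique; the helper name counter_delta is hypothetical, and the approach assumes the counter wraps at most once between two samples.

```c
#include <stdint.h>

/*
 * Number of events between two samples of a 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so the result is still correct
 * when 'now' has wrapped past zero, provided it wrapped at most once.
 */
static uint32_t counter_delta(uint32_t prev, uint32_t now)
{
    return (uint32_t)(now - prev);
}
```

For example, a previous sample of 0xFFFFFFF0 followed by a current sample of 0x10 yields a delta of 0x20 events, even though the raw value went down.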
2007-10-15 | Make scheduler debug file operations const | Arjan van de Ven
In general, struct file_operations are const in the kernel, to not have false cacheline sharing and to catch bugs at compiletime with accidental writes to them. The new scheduler code introduces a new non-const one; fix this up. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: debug, improve migration statistics | Ingo Molnar
add new migration statistics when SCHED_DEBUG and SCHEDSTATS are enabled. Available in /proc/<PID>/sched. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: debug: increase width of debug line | Ingo Molnar
increase width of debug line - in preparation of more debugging info. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: group scheduling, sysfs tunables | Dhaval Giani
Add tunables in sysfs to modify a user's cpu share. A directory is created in sysfs for each new user in the system.

  /sys/kernel/uids/<uid>/cpu_share

Reading this file returns the cpu shares granted for the user. Writing into this file modifies the cpu share for the user. Only an administrator is allowed to modify a user's cpu share.

Ex:
  # cd /sys/kernel/uids/
  # cat 512/cpu_share
  1024
  # echo 2048 > 512/cpu_share
  # cat 512/cpu_share
  2048
  #

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: cleanup: rename task_grp to task_group | Ingo Molnar
cleanup: rename task_grp to task_group. No need to save two characters and 'grp' is annoying to read. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: group scheduler, fix coding style issues | Srivatsa Vaddagiri
Fix coding style issues reported by Randy Dunlap and others Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: speed up and simplify vslice calculations | Peter Zijlstra
speed up and simplify vslice calculations. [ From: Mike Galbraith <efault@gmx.de>: build fix ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-15 | sched: clean up schedstats, cnt -> count | Ingo Molnar
rename all 'cnt' fields and variables to the less yucky 'count' name. yuckage noticed by Andrew Morton.

no change in code, other than the /proc/sched_debug bkl_count string got a bit larger:

   text    data     bss     dec     hex filename
  38236    3506      24   41766    a326 sched.o.before
  38240    3506      24   41770    a32a sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched debug: check spread | Peter Zijlstra
debug feature: check how well we schedule within a reasonable vruntime 'spread' range. (note that CPU overload can increase the spread, so this is not a hard condition, but normal loads should be within the spread.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2007-10-15 | sched debug: more width for parameter printouts | Ingo Molnar
more width for parameter printouts in /proc/sched_debug. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched debug: print settings | Ingo Molnar
print the current value of all tunables in /proc/sched_debug output. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched debug: BKL usage statistics, fix | S.Caglar Onur
build fix for the SCHED_DEBUG && !SCHEDSTATS case. Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched debug: BKL usage statistics | Ingo Molnar
add per task and per rq BKL usage statistics. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: add fair-user scheduler | Srivatsa Vaddagiri
Enable user-id based fair group scheduling. This is useful for anyone who wants to test the group scheduler w/o having to enable CONFIG_CGROUPS. A separate scheduling group (i.e struct task_grp) is automatically created for every new user added to the system. Upon uid change for a task, it is made to move to the corresponding scheduling group. A /proc tunable (/proc/root_user_share) is also provided to tune root user's quota of cpu bandwidth. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: print nr_running and load in /proc/sched_debug | Srivatsa Vaddagiri
- print nr_running and load information for cfs_rq in /proc/sched_debug Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: fix formatting of /proc/sched_debug | Mike Galbraith
fix formatting of /proc/sched_debug Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: enhance debug output | Ingo Molnar
enhance debug output by changing 12345678 nsecs to 12.345678 output, this is more human-readable. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
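The conversion from a raw nanosecond count such as 12345678 to the "12.345678" form amounts to splitting the value at 10^6 and zero-padding the remainder to six digits. The C sketch below shows one way to do that in user space; format_nsec is a hypothetical helper, not the kernel's implementation, and the sign handling is an assumption added for completeness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format 'nsec' as "<high>.<low>", where high = nsec / 1000000 and
 * low = nsec % 1000000 padded to six digits. Returns the number of
 * characters snprintf() would have written (excluding the NUL).
 */
static int format_nsec(long long nsec, char *buf, size_t len)
{
    const char *sign = "";
    unsigned long long v;

    assert(buf != NULL);
    if (nsec < 0) {
        sign = "-";
        /* Negate in unsigned arithmetic so LLONG_MIN is handled safely. */
        v = 0ULL - (unsigned long long)nsec;
    } else {
        v = (unsigned long long)nsec;
    }
    return snprintf(buf, len, "%s%llu.%06llu",
                    sign, v / 1000000ULL, v % 1000000ULL);
}
```

Doing the division on an unsigned value keeps the quotient and remainder non-negative, so the "%06llu" padding always produces exactly six fractional digits.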
2007-10-15 | sched: prettify /proc/sched_debug output | Ingo Molnar
print the correct amount of dashes in /proc/sched_debug. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: do not keep current in the tree and get rid of sched_entity::fair_key | Dmitry Adamushko
Get rid of 'sched_entity::fair_key'. As a side effect, 'current' is not kept within the tree for SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code (e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes them (e.g. a single update_curr() now vs. dequeue/enqueue() before in entity_tick()). Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: remove wait_runtime fields and features | Ingo Molnar
remove wait_runtime based fields and features, now that the CFS math has been changed over to the vruntime metric. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: remove wait_runtime limit | Ingo Molnar
remove the wait_runtime-limit fields and the code depending on it, now that the math has been changed over to rely on the vruntime metric. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: clean up struct load_stat | Dmitry Adamushko
'struct load_stat' is redundant now so let's get rid of it. Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: add more vruntime statistics | Ingo Molnar
add more vruntime statistics. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: add se->vruntime debugging | Ingo Molnar
debug se->vruntime fields. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>
2007-10-15 | sched: remove precise CPU load | Ingo Molnar
CPU load calculations are statistical anyway, and there's little benefit from having it calculated on every scheduling event. So remove this code, it gets rid of a divide from the scheduler wakeup and context-switch fastpath. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: debug: track maximum 'slice' | Ingo Molnar
track the maximum amount of time a task has executed while the CPU load was at least 2x. (i.e. at least two nice-0 tasks were runnable) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-15 | sched: small sched_debug cleanup | Ingo Molnar
small kernel/sched_debug.c cleanup - break up multi-variable assignment.

no code changed:

   text    data     bss     dec     hex filename
  38869    3550      24   42443    a5cb sched.o.before
  38869    3550      24   42443    a5cb sched.o.after

Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2007-09-05 | sched: debug: fix sum_exec_runtime clearing | Ingo Molnar
when cleaning sched-stats also clear prev_sum_exec_runtime. Signed-off-by: Ingo Molnar <mingo@elte.hu>