<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/proc/stat.c, branch v3.4.19</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/fs/proc/stat.c?h=v3.4.19</id>
<link rel='self' href='https://git.amat.us/linux/atom/fs/proc/stat.c?h=v3.4.19'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-10-28T17:14:12Z</updated>
<entry>
<title>nohz: Fix idle ticks in cpu summary line of /proc/stat</title>
<updated>2012-10-28T17:14:12Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2012-10-10T06:21:09Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d39bd2701f0f414aba1376ae61202b526ec7e128'/>
<id>urn:sha1:d39bd2701f0f414aba1376ae61202b526ec7e128</id>
<content type='text'>
commit 7386cdbf2f57ea8cff3c9fde93f206e58b9fe13f upstream.

Git commit 09a1d34f8535ecf9 "nohz: Make idle/iowait counter update
conditional" introduced a bug in regard to cpu hotplug. The effect is
that the number of idle ticks in the cpu summary line in /proc/stat is
still counting ticks for offline cpus.

Reproduction is easy, just start a workload that keeps all cpus busy,
switch off one or more cpus and then watch the idle field in top.
On a dual-core with one cpu 100% busy and one offline cpu you will get
something like this:

%Cpu(s): 48.7 us,  1.3 sy,  0.0 ni, 50.0 id,  0.0 wa,  0.0 hi,  0.0 si,
%0.0 st

The problem is that an offline cpu still has ts-&gt;idle_active == 1.
To fix this we should make sure that the cpu is online when calling
get_cpu_idle_time_us and get_cpu_iowait_time_us.

[Srivatsa: Rebased to current mainline]

Reported-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Reviewed-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Signed-off-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20121010061820.8999.57245.stgit@srivatsabhat.in.ibm.com
Cc: deepthi@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>proc: stats: Use arch_idle_time for idle and iowait times if available</title>
<updated>2012-03-30T13:43:33Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2012-03-30T10:23:08Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=cb85a6ed67e979c59a29b7b4e8217e755b951cf4'/>
<id>urn:sha1:cb85a6ed67e979c59a29b7b4e8217e755b951cf4</id>
<content type='text'>
Git commit a25cac5198d4ff28 "proc: Consider NO_HZ when printing idle and
iowait times" changes the code for /proc/stat to use get_cpu_idle_time_us
and get_cpu_iowait_time_us if the system is running with nohz enabled.
For architectures which define arch_idle_time (currently s390 only)
this is a change for the worse. The result of arch_idle_time is supposed
to be the exact sleep time of the target cpu and should be used instead
of the value kept by the scheduler.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Reviewed-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20120330122308.18720283@de.ibm.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>procfs: add num_to_str() to speed up /proc/stat</title>
<updated>2012-03-23T23:58:42Z</updated>
<author>
<name>KAMEZAWA Hiroyuki</name>
<email>kamezawa.hiroyu@jp.fujitsu.com</email>
</author>
<published>2012-03-23T22:02:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1ac101a5d675aca2426c5cd460c73fb95acb8391'/>
<id>urn:sha1:1ac101a5d675aca2426c5cd460c73fb95acb8391</id>
<content type='text'>
== stat_check.py
num = 0
with open("/proc/stat") as f:
        while num &lt; 1000 :
                data = f.read()
                f.seek(0, 0)
                num = num + 1
==

perf shows

    20.39%  stat_check.py  [kernel.kallsyms]    [k] format_decode
    13.41%  stat_check.py  [kernel.kallsyms]    [k] number
    12.61%  stat_check.py  [kernel.kallsyms]    [k] vsnprintf
    10.85%  stat_check.py  [kernel.kallsyms]    [k] memcpy
     4.85%  stat_check.py  [kernel.kallsyms]    [k] radix_tree_lookup
     4.43%  stat_check.py  [kernel.kallsyms]    [k] seq_printf

This patch removes most of calls to vsnprintf() by adding num_to_str()
and seq_print_decimal_ull(), which prints decimal numbers without rich
functions provided by printf().

On my 8cpu box.
== Before patch ==
[root@bluextal test]# time ./stat_check.py

real    0m0.150s
user    0m0.026s
sys     0m0.121s

== After patch ==
[root@bluextal test]# time ./stat_check.py

real    0m0.055s
user    0m0.022s
sys     0m0.030s

[akpm@linux-foundation.org: remove incorrect comment, use less statck in num_to_str(), move comment from .h to .c, simplify seq_put_decimal_ull()]
[andrea@betterlinux.com: avoid breaking the ABI in /proc/stat]
Signed-off-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Signed-off-by: Andrea Righi &lt;andrea@betterlinux.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: speed up /proc/stat handling</title>
<updated>2012-03-23T23:58:42Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-03-23T22:02:53Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=59a32e2ce5eb809967cac4e718bc527beca83c59'/>
<id>urn:sha1:59a32e2ce5eb809967cac4e718bc527beca83c59</id>
<content type='text'>
On a typical 16 cpus machine, "cat /proc/stat" gives more than 4096 bytes,
and is slow :

  # strace -T -o /tmp/STRACE cat /proc/stat | wc -c
  5826
  # grep "cpu " /tmp/STRACE
  read(0, "cpu  1949310 19 2144714 12117253"..., 32768) = 5826 &lt;0.001504&gt;

Thats partly because show_stat() must be called twice since initial
buffer size is too small (4096 bytes for less than 32 possible cpus)

Fix this by :

 1) Taking into account nr_irqs in the initial buffer sizing.

 2) Using ksize() to allow better filling of initial buffer.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Russell King - ARM Linux &lt;linux@arm.linux.org.uk&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched/accounting, proc: Fix /proc/stat interrupts sum</title>
<updated>2012-01-16T07:13:27Z</updated>
<author>
<name>Russell King</name>
<email>rmk+kernel@arm.linux.org.uk</email>
</author>
<published>2012-01-14T00:01:01Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f7e6746ebae984ea67b0a1a1e23c7e6698240631'/>
<id>urn:sha1:f7e6746ebae984ea67b0a1a1e23c7e6698240631</id>
<content type='text'>
Commit 3292beb340c7688 ("sched/accounting: Change cpustat fields to an array")
deleted the code which provides us with the sum of all interrupts in the
system, causing vmstat to report zero interrupts occuring in the system.

Fix this by restoring the code.

Signed-off-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Tested-by: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt; # [on ARM]
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Tuner &lt;pjt@google.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-01-06T16:44:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-01-06T16:33:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0db49b72bce26341274b74fd968501489a361ae3'/>
<id>urn:sha1:0db49b72bce26341274b74fd968501489a361ae3</id>
<content type='text'>
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
  sched/tracing: Add a new tracepoint for sleeptime
  sched: Disable scheduler warnings during oopses
  sched: Fix cgroup movement of waking process
  sched: Fix cgroup movement of newly created process
  sched: Fix cgroup movement of forking process
  sched: Remove cfs bandwidth period check in tg_set_cfs_period()
  sched: Fix load-balance lock-breaking
  sched: Replace all_pinned with a generic flags field
  sched: Only queue remote wakeups when crossing cache boundaries
  sched: Add missing rcu_dereference() around -&gt;real_parent usage
  [S390] fix cputime overflow in uptime_proc_show
  [S390] cputime: add sparse checking and cleanup
  sched: Mark parent and real_parent as __rcu
  sched, nohz: Fix missing RCU read lock
  sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer
  sched, nohz: Fix the idle cpu check in nohz_idle_balance
  sched: Use jump_labels for sched_feat
  sched/accounting: Fix parameter passing in task_group_account_field
  sched/accounting: Fix user/system tick double accounting
  sched/accounting: Re-use scheduler statistics for the root cgroup
  ...

Fix up conflicts in
 - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h
	usecs_to_cputime64() vs the sparse cleanups
 - kernel/sched/fair.c, kernel/time/tick-sched.c
	scheduler changes in multiple branches
</content>
</entry>
<entry>
<title>procfs: do not confuse jiffies with cputime64_t</title>
<updated>2011-12-30T00:31:57Z</updated>
<author>
<name>Andreas Schwab</name>
<email>schwab@linux-m68k.org</email>
</author>
<published>2011-12-28T23:57:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=34845636a184f3be91a531098192592cbe6db587'/>
<id>urn:sha1:34845636a184f3be91a531098192592cbe6db587</id>
<content type='text'>
Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time
for nohz") did not take into account that one some architectures jiffies
and cputime use different units.

This causes get_idle_time() to return numbers in the wrong units, making
the idle time fields in /proc/stat wrong.

Instead of converting the usec value returned by
get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
usecs_to_cputime64 to convert it to the correct unit of cputime64_t.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: "Artem S. Tashkinov" &lt;t.artem@mailcity.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge commit 'v3.2-rc5' into sched/core</title>
<updated>2011-12-15T07:21:30Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2011-12-15T07:21:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6a54aebf6978e9f296a4d3da3e40af425163c22e'/>
<id>urn:sha1:6a54aebf6978e9f296a4d3da3e40af425163c22e</id>
<content type='text'>
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>procfs: do not overflow get_{idle,iowait}_time for nohz</title>
<updated>2011-12-09T15:50:29Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2011-12-08T22:34:32Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2a95ea6c0d129b4568fb64e1deda16ceb20e6636'/>
<id>urn:sha1:2a95ea6c0d129b4568fb64e1deda16ceb20e6636</id>
<content type='text'>
Since commit a25cac5198d4 ("proc: Consider NO_HZ when printing idle and
iowait times") we are reporting idle/io_wait time also while a CPU is
tickless.  We rely on get_{idle,iowait}_time functions to retrieve
proper data.

These functions, however, use usecs_to_cputime to translate micro
seconds time to cputime64_t.  This is just an alias to usecs_to_jiffies
which reduces the data type from u64 to unsigned int and also checks
whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET)
and returns MAX_JIFFY_OFFSET in that case.

When we overflow depends on CONFIG_HZ but especially for CONFIG_HZ_300
it is quite low (1431649781) so we are getting MAX_JIFFY_OFFSET for
&gt;3000s! until we overflow unsigned int.  Just for reference
CONFIG_HZ_100 has an overflow window around 20s, CONFIG_HZ_250 ~8s and
CONFIG_HZ_1000 ~2s.

This results in a bug when people saw [h]top going mad reporting 100%
CPU usage even though there was basically no CPU load.  The reason was
simply that /proc/stat stopped reporting idle/io_wait changes (and
reported MAX_JIFFY_OFFSET) and so the only change happening was for user
system time.

Let's use nsecs_to_jiffies64 instead which doesn't reduce the precision
to 32b type and it is much more appropriate for cumulative time values
(unlike usecs_to_jiffies which intended for timeout calculations).

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Tested-by: Artem S. Tashkinov &lt;t.artem@mailcity.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched/accounting: Change cpustat fields to an array</title>
<updated>2011-12-06T08:06:38Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2011-11-28T16:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3292beb340c76884427faa1f5d6085719477d889'/>
<id>urn:sha1:3292beb340c76884427faa1f5d6085719477d889</id>
<content type='text'>
This patch changes fields in cpustat from a structure, to an
u64 array. Math gets easier, and the code is more flexible.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul Tuner &lt;pjt@google.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
</feed>
