<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/time, branch v2.6.32.41</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/kernel/time?h=v2.6.32.41</id>
<link rel='self' href='https://git.amat.us/linux/atom/kernel/time?h=v2.6.32.41'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2011-05-23T18:20:19Z</updated>
<entry>
<title>tick: Clear broadcast active bit when switching to oneshot</title>
<updated>2011-05-23T18:20:19Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-05-16T09:07:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f56541cf492165e973eec004e3749f8b595a04d8'/>
<id>urn:sha1:f56541cf492165e973eec004e3749f8b595a04d8</id>
<content type='text'>
commit 07f4beb0b5bbfaf36a64aa00d59e670ec578a95a upstream.

The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the brodcast expiry value of those cpus to the next tick.

The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set in consequence the bit prevents that the broadcast
device is armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.

In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christians bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.

The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.

[ I fundamentally hate that broadcast crap. Why the heck thought some
  folks that when going into deep idle it's a brilliant concept to
  switch off the last device which brings the cpu back from that
  state? ]

Thanks to Christian for providing all the valuable debug information!

Reported-and-tested-by: Christian Hoffmann &lt;email@christianhoffmann.info&gt;
Cc: John Stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/%3Calpine.LFD.2.02.1105161105170.3078%40ionos%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>clocksource: Install completely before selecting</title>
<updated>2011-05-23T18:20:19Z</updated>
<author>
<name>john stultz</name>
<email>johnstul@us.ibm.com</email>
</author>
<published>2011-05-05T01:16:50Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a83b90b7029a52ca073e4efee437ee74853d32aa'/>
<id>urn:sha1:a83b90b7029a52ca073e4efee437ee74853d32aa</id>
<content type='text'>
commit e05b2efb82596905ebfe88e8612ee81dec9b6592 upstream.

Christian Hoffmann reported that the command line clocksource override
with acpi_pm timer fails:

 Kernel command line: &lt;SNIP&gt; clocksource=acpi_pm
 hpet clockevent registered
 Switching to clocksource hpet
 Override clocksource acpi_pm is not HRT compatible.
 Cannot switch while in HRT/NOHZ mode.

The watchdog code is what enables CLOCK_SOURCE_VALID_FOR_HRES, but we
actually end up selecting the clocksource before we enqueue it into
the watchdog list, so that's why we see the warning and fail to switch
to acpi_pm timer as requested. That's particularly bad when we want to
debug timekeeping related problems in early boot.

Put the selection call last.

Reported-by: Christian Hoffmann &lt;email@christianhoffmann.info&gt;
Signed-off-by: John Stultz &lt;johnstul@us.ibm.com&gt;
Link: http://lkml.kernel.org/r/%3C1304558210.2943.24.camel%40work-vm%3E
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Fix time() inconsistencies caused by intermediate xtime_cache values being read</title>
<updated>2011-05-23T18:20:15Z</updated>
<author>
<name>john stultz</name>
<email>johnstul@us.ibm.com</email>
</author>
<published>2011-05-11T23:10:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2a4027a4f658dee8decf3844354bc192c43b365c'/>
<id>urn:sha1:2a4027a4f658dee8decf3844354bc192c43b365c</id>
<content type='text'>
Currently with 2.6.32-longterm, its possible for time() to occasionally
return values one second earlier then the previous time() call.

This happens because update_xtime_cache() does:
	xtime_cache = xtime;
	timespec_add_ns(&amp;xtime_cache, nsec);

Its possible that xtime is 1sec,999msecs, and nsecs is 1ms, resulting in
a xtime_cache that is 2sec,0ms.

get_seconds() (which is used by sys_time()) does not take the
xtime_lock, which is ok as the xtime.tv_sec value is a long and can be
atomically read safely.

The problem occurs the next call to update_xtime_cache() if xtime has
not increased:
	/* This sets xtime_cache back to 1sec, 999msec */
	xtime_cache = xtime; 
	/* get_seconds, calls here, and sees a 1second inconsistency */
	timespec_add_ns(&amp;xtime_cache, nsec);


In order to resolve this, we could add locking to get_seconds(), but it
needs to be lock free, as it is called from the machine check handler,
opening a possible deadlock.

So instead, this patch introduces an intermediate value for the
calculations, so that we only assign xtime_cache once with the correct
time, using ACCESS_ONCE to make sure the compiler doesn't optimize out
any intermediate values.

The xtime_cache manipulations were removed with 2.6.35, so that kernel
and later do not need this change.

In 2.6.33 and 2.6.34 the logarithmic accumulation should make it so
xtime is updated each tick, so it is unlikely that two updates to
xtime_cache could occur while the difference between xtime and
xtime_cache crosses the second boundary. However, the paranoid might
want to pull this into 2.6.33/34-longterm just to be sure.

Thanks to Stephen for helping finally narrow down the root cause and
many hours of help with testing and validation. Also thanks to Max,
Andi, Eric and Paul for review of earlier attempts and helping clarify
what is possible with regard to out of order execution.

Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: John Stultz &lt;johnstul@us.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>clockevents: Prevent oneshot mode when broadcast device is periodic</title>
<updated>2011-03-07T23:17:54Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-02-25T21:34:23Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f3d155634f6fc7ee665731768ed6f85d04ef442c'/>
<id>urn:sha1:f3d155634f6fc7ee665731768ed6f85d04ef442c</id>
<content type='text'>
commit 3a142a0672b48a853f00af61f184c7341ac9c99d upstream.

When the per cpu timer is marked CLOCK_EVT_FEAT_C3STOP, then we only
can switch into oneshot mode, when the backup broadcast device
supports oneshot mode as well. Otherwise we would try to switch the
broadcast device into an unsupported mode unconditionally. This went
unnoticed so far as the current available broadcast devices support
oneshot mode. Seth unearthed this problem while debugging and working
around an hpet related BIOS wreckage.

Add the necessary check to tick_is_oneshot_available().

Reported-and-tested-by: Seth Forshee &lt;seth.forshee@canonical.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
LKML-Reference: &lt;alpine.LFD.2.00.1102252231200.2701@localhost6.localdomain6&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>timekeeping: Fix clock_gettime vsyscall time warp</title>
<updated>2010-08-13T20:20:13Z</updated>
<author>
<name>Lin Ming</name>
<email>ming.m.lin@intel.com</email>
</author>
<published>2009-11-17T05:49:50Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=8aa3149405e33cec4f866cfe7f92c2b40d259613'/>
<id>urn:sha1:8aa3149405e33cec4f866cfe7f92c2b40d259613</id>
<content type='text'>
commit 0696b711e4be45fa104c12329f617beb29c03f78 upstream.

Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
to struct timekeeper" the clock multiplier of vsyscall is updated with
the unmodified clock multiplier of the clock source and not with the
NTP adjusted multiplier of the timekeeper.

This causes user space observerable time warps:
new CLOCK-warp maximum: 120 nsecs,  00000025c337c537 -&gt; 00000025c337c4bf

Add a new argument "mult" to update_vsyscall() and hand in the
timekeeping internal NTP adjusted multiplier.

Signed-off-by: Lin Ming &lt;ming.m.lin@intel.com&gt;
Cc: "Zhang Yanmin" &lt;yanmin_zhang@linux.intel.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
LKML-Reference: &lt;1258436990.17765.83.camel@minggr.sh.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Kurt Garloff &lt;garloff@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>nohz: Reuse ktime in sub-functions of tick_check_idle.</title>
<updated>2010-08-13T20:20:13Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2009-09-29T12:25:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e66bb883118bf9db3b66f2ec3315be83d2b27aeb'/>
<id>urn:sha1:e66bb883118bf9db3b66f2ec3315be83d2b27aeb</id>
<content type='text'>
commit eed3b9cf3fe3fcc7a50238dfcab63a63914e8f42 upstream.

On a system with NOHZ=y tick_check_idle calls tick_nohz_stop_idle and
tick_nohz_update_jiffies. Given the right conditions (ts-&gt;idle_active
and/or ts-&gt;tick_stopped) both function get a time stamp with ktime_get.
The same time stamp can be reused if both function require one.

On s390 this change has the additional benefit that gcc inlines the
tick_nohz_stop_idle function into tick_check_idle. The number of
instructions to execute tick_check_idle drops from 225 to 144
(without the ktime_get optimization it is 367 vs 215 instructions).

before:

 0)               |  tick_check_idle() {
 0)               |    tick_nohz_stop_idle() {
 0)               |      ktime_get() {
 0)               |        read_tod_clock() {
 0)   0.601 us    |        }
 0)   1.765 us    |      }
 0)   3.047 us    |    }
 0)               |    ktime_get() {
 0)               |      read_tod_clock() {
 0)   0.570 us    |      }
 0)   1.727 us    |    }
 0)               |    tick_do_update_jiffies64() {
 0)   0.609 us    |    }
 0)   8.055 us    |  }

after:

 0)               |  tick_check_idle() {
 0)               |    ktime_get() {
 0)               |      read_tod_clock() {
 0)   0.617 us    |      }
 0)   1.773 us    |    }
 0)               |    tick_do_update_jiffies64() {
 0)   0.593 us    |    }
 0)   4.477 us    |  }

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: john stultz &lt;johnstul@us.ibm.com&gt;
LKML-Reference: &lt;20090929122533.206589318@de.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: John Jolly &lt;jjolly@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>nohz: Introduce arch_needs_cpu</title>
<updated>2010-08-13T20:20:13Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2009-09-29T12:25:16Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7bfa0a73b2723bf4cc8a154534c351f31f8a120f'/>
<id>urn:sha1:7bfa0a73b2723bf4cc8a154534c351f31f8a120f</id>
<content type='text'>
commit 3c5d92a0cfb5103c0d5ab74d4ae6373d3af38148 upstream.

Allow the architecture to request a normal jiffy tick when the system
goes idle and tick_nohz_stop_sched_tick is called . On s390 the hook is
used to prevent the system going fully idle if there has been an
interrupt other than a clock comparator interrupt since the last wakeup.

On s390 the HiperSockets response time for 1 connection ping-pong goes
down from 42 to 34 microseconds. The CPU cost decreases by 27%.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
LKML-Reference: &lt;20090929122533.402715150@de.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: John Jolly &lt;jjolly@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>hrtimer: Tune hrtimer_interrupt hang logic</title>
<updated>2010-04-01T22:58:14Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2009-11-13T16:05:44Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2605c157ed3bb18e66a6a08263448046f77ba5fb'/>
<id>urn:sha1:2605c157ed3bb18e66a6a08263448046f77ba5fb</id>
<content type='text'>
commit 41d2e494937715d3150e5c75d01f0e75ae899337 upstream.

The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
execution time of the hrtimer callbacks.

This is error-prone for virtual machines, where a guest vcpu can be
scheduled out during the execution of the callbacks (and the callbacks
themselves can do operations that translate to blocking operations in
the hypervisor), which in can lead to large min_delta_ns rendering the
system unusable.

Replace the current heuristics with something more reliable. Allow the
interrupt code to try 3 times to catch up with the lost time. If that
fails use the total time spent in the interrupt handler to defer the
next timer interrupt so the system can catch up with other things
which got delayed. Limit that deferment to 100ms.

The retry events and the maximum time spent in the interrupt handler
are recorded and exposed via /proc/timer_list

Inspired by a patch from Marcelo.

Reported-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: kvm@vger.kernel.org
Cc: Jeremy Fitzhardinge &lt;jeremy@goop.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>timekeeping: Prevent oops when GENERIC_TIME=n</title>
<updated>2010-04-01T22:58:06Z</updated>
<author>
<name>john stultz</name>
<email>johnstul@us.ibm.com</email>
</author>
<published>2010-03-01T20:34:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=74b176077272028d0a88eddd59e630abc37cd475'/>
<id>urn:sha1:74b176077272028d0a88eddd59e630abc37cd475</id>
<content type='text'>
commit ad6759fbf35d104dbf573cd6f4c6784ad6823f7e upstream.

Aaro Koskinen reported an issue in kernel.org bugzilla #15366, where
on non-GENERIC_TIME systems, accessing
/sys/devices/system/clocksource/clocksource0/current_clocksource
results in an oops.

It seems the timekeeper/clocksource rework missed initializing the
curr_clocksource value in the !GENERIC_TIME case.

Thanks to Aaro for reporting and diagnosing the issue as well as
testing the fix!

Reported-by: Aaro Koskinen &lt;aaro.koskinen@iki.fi&gt;
Signed-off-by: John Stultz &lt;johnstul@us.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
LKML-Reference: &lt;1267475683.4216.61.camel@localhost.localdomain&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Export the symbol of getboottime and mmonotonic_to_bootbased</title>
<updated>2010-02-23T15:37:52Z</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2010-01-27T11:13:40Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1c63c20663a84b03a2dc47d1c1998f1883081162'/>
<id>urn:sha1:1c63c20663a84b03a2dc47d1c1998f1883081162</id>
<content type='text'>
commit c93d89f3dbf0202bf19c07960ca8602b48c2f9a0 upstream.

Export getboottime and monotonic_to_bootbased in order to let them
could be used by following patch.

Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
</feed>
