<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/block/elevator.c, branch v3.12.14</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/block/elevator.c?h=v3.12.14</id>
<link rel='self' href='https://git.amat.us/linux/atom/block/elevator.c?h=v3.12.14'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-12-08T15:29:16Z</updated>
<entry>
<title>elevator: acquire q-&gt;sysfs_lock in elevator_change()</title>
<updated>2013-12-08T15:29:16Z</updated>
<author>
<name>Tomoki Sekiyama</name>
<email>tomoki.sekiyama@hds.com</email>
</author>
<published>2013-10-15T22:42:19Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fc655139edb1557a66fefe945ceb61715b18f929'/>
<id>urn:sha1:fc655139edb1557a66fefe945ceb61715b18f929</id>
<content type='text'>
commit 7c8a3679e3d8e9d92d58f282161760a0e247df97 upstream.

Add locking of q-&gt;sysfs_lock into elevator_change() (an exported function)
to ensure it is held to protect q-&gt;elevator from elevator_init(), even if
elevator_change() is called from non-sysfs paths.
sysfs path (elv_iosched_store) uses __elevator_change(), non-locking
version, as the lock is already taken by elv_iosched_store().

Signed-off-by: Tomoki Sekiyama &lt;tomoki.sekiyama@hds.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Josh Boyer &lt;jwboyer@fedoraproject.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>elevator: Fix a race in elevator switching and md device initialization</title>
<updated>2013-12-08T15:29:16Z</updated>
<author>
<name>Tomoki Sekiyama</name>
<email>tomoki.sekiyama@hds.com</email>
</author>
<published>2013-10-15T22:42:16Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d6a5267e403642c8802e5bc59cf754ba44a8993b'/>
<id>urn:sha1:d6a5267e403642c8802e5bc59cf754ba44a8993b</id>
<content type='text'>
commit eb1c160b22655fd4ec44be732d6594fd1b1e44f4 upstream.

The soft lockup below happens at the boot time of the system using dm
multipath and the udev rules to switch scheduler.

[  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
[  356.127001] RIP: 0010:[&lt;ffffffff81072a7d&gt;]  [&lt;ffffffff81072a7d&gt;] lock_timer_base.isra.35+0x1d/0x50
...
[  356.127001] Call Trace:
[  356.127001]  [&lt;ffffffff81073810&gt;] try_to_del_timer_sync+0x20/0x70
[  356.127001]  [&lt;ffffffff8118b08a&gt;] ? kmem_cache_alloc_node_trace+0x20a/0x230
[  356.127001]  [&lt;ffffffff810738b2&gt;] del_timer_sync+0x52/0x60
[  356.127001]  [&lt;ffffffff812ece22&gt;] cfq_exit_queue+0x32/0xf0
[  356.127001]  [&lt;ffffffff812c98df&gt;] elevator_exit+0x2f/0x50
[  356.127001]  [&lt;ffffffff812c9f21&gt;] elevator_change+0xf1/0x1c0
[  356.127001]  [&lt;ffffffff812caa50&gt;] elv_iosched_store+0x20/0x50
[  356.127001]  [&lt;ffffffff812d1d09&gt;] queue_attr_store+0x59/0xb0
[  356.127001]  [&lt;ffffffff812143f6&gt;] sysfs_write_file+0xc6/0x140
[  356.127001]  [&lt;ffffffff811a326d&gt;] vfs_write+0xbd/0x1e0
[  356.127001]  [&lt;ffffffff811a3ca9&gt;] SyS_write+0x49/0xa0
[  356.127001]  [&lt;ffffffff8164e899&gt;] system_call_fastpath+0x16/0x1b

This is caused by a race between md device initialization by multipathd and
shell script to switch the scheduler using sysfs.

 - multipathd:
   SyS_ioctl -&gt; do_vfs_ioctl -&gt; dm_ctl_ioctl -&gt; ctl_ioctl -&gt; table_load
   -&gt; dm_setup_md_queue -&gt; blk_init_allocated_queue -&gt; elevator_init
    q-&gt;elevator = elevator_alloc(q, e); // not yet initialized

 - sh -c 'echo deadline &gt; /sys/$DEVPATH/queue/scheduler':
   elevator_switch (in the call trace above)
    struct elevator_queue *old = q-&gt;elevator;
    q-&gt;elevator = elevator_alloc(q, new_e);
    elevator_exit(old);                 // lockup! (*)

 - multipathd: (cont.)
    err = e-&gt;ops.elevator_init_fn(q);   // init fails; q-&gt;elevator is modified

(*) When del_timer_sync() is called, lock_timer_base() will loop infinitely
while timer-&gt;base == NULL. In this case, as timer will never initialized,
it results in lockup.

This patch introduces acquisition of q-&gt;sysfs_lock around elevator_init()
into blk_init_allocated_queue(), to provide mutual exclusion between
initialization of the q-&gt;scheduler and switching of the scheduler.

This should fix this bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=902012

Signed-off-by: Tomoki Sekiyama &lt;tomoki.sekiyama@hds.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Josh Boyer &lt;jwboyer@fedoraproject.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)</title>
<updated>2013-09-11T19:22:03Z</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2013-08-29T22:21:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c1b511eb211a6c72d66f7755d2b30a9a91ef9423'/>
<id>urn:sha1:c1b511eb211a6c72d66f7755d2b30a9a91ef9423</id>
<content type='text'>
Use the helper function instead of __GFP_ZERO.

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>elevator: Fix a race in elevator switching</title>
<updated>2013-07-03T11:25:24Z</updated>
<author>
<name>Jianpeng Ma</name>
<email>majianpeng@gmail.com</email>
</author>
<published>2013-07-03T11:25:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d50235b7bc3ee0a0427984d763ea7534149531b4'/>
<id>urn:sha1:d50235b7bc3ee0a0427984d763ea7534149531b4</id>
<content type='text'>
There's a race between elevator switching and normal io operation.
    Because the allocation of struct elevator_queue and struct elevator_data
    don't in a atomic operation.So there are have chance to use NULL
    -&gt;elevator_data.
    For example:
        Thread A:                               Thread B
        blk_queu_bio                            elevator_switch
        spin_lock_irq(q-&gt;queue_block)           elevator_alloc
        elv_merge                               elevator_init_fn

    Because call elevator_alloc, it can't hold queue_lock and the
    -&gt;elevator_data is NULL.So at the same time, threadA call elv_merge and
    nedd some info of elevator_data.So the crash happened.

    Move the elevator_alloc into func elevator_init_fn, it make the
    operations in a atomic operation.

    Using the follow method can easy reproduce this bug
    1:dd if=/dev/sdb of=/dev/null
    2:while true;do echo noop &gt; scheduler;echo deadline &gt; scheduler;done

    The test method also use this method.

Signed-off-by: Jianpeng Ma &lt;majianpeng@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: implement runtime pm strategy</title>
<updated>2013-03-23T04:22:15Z</updated>
<author>
<name>Lin Ming</name>
<email>ming.m.lin@intel.com</email>
</author>
<published>2013-03-23T03:42:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c8158819d506a8aedeca53c52dfb709a0aabe011'/>
<id>urn:sha1:c8158819d506a8aedeca53c52dfb709a0aabe011</id>
<content type='text'>
When a request is added:
    If device is suspended or is suspending and the request is not a
    PM request, resume the device.

When the last request finishes:
    Call pm_runtime_mark_last_busy().

When pick a request:
    If device is resuming/suspending, then only PM request is allowed
    to go.

The idea and API is designed by Alan Stern and described here:
http://marc.info/?l=linux-scsi&amp;m=133727953625963&amp;w=2

Signed-off-by: Lin Ming &lt;ming.m.lin@intel.com&gt;
Signed-off-by: Aaron Lu &lt;aaron.lu@intel.com&gt;
Acked-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-block</title>
<updated>2013-02-28T20:52:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-28T20:52:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ee89f81252179dcbf6cd65bd48299f5e52292d88'/>
<id>urn:sha1:ee89f81252179dcbf6cd65bd48299f5e52292d88</id>
<content type='text'>
Pull block IO core bits from Jens Axboe:
 "Below are the core block IO bits for 3.9.  It was delayed a few days
  since my workstation kept crashing every 2-8h after pulling it into
  current -git, but turns out it is a bug in the new pstate code (divide
  by zero, will report separately).  In any case, it contains:

   - The big cfq/blkcg update from Tejun and and Vivek.

   - Additional block and writeback tracepoints from Tejun.

   - Improvement of the should sort (based on queues) logic in the plug
     flushing.

   - _io() variants of the wait_for_completion() interface, using
     io_schedule() instead of schedule() to contribute to io wait
     properly.

   - Various little fixes.

  You'll get two trivial merge conflicts, which should be easy enough to
  fix up"

Fix up the trivial conflicts due to hlist traversal cleanups (commit
b67bfe0d42ca: "hlist: drop the node parameter from iterators").

* 'for-3.9/core' of git://git.kernel.dk/linux-block: (39 commits)
  block: remove redundant check to bd_openers()
  block: use i_size_write() in bd_set_size()
  cfq: fix lock imbalance with failed allocations
  drivers/block/swim3.c: fix null pointer dereference
  block: don't select PERCPU_RWSEM
  block: account iowait time when waiting for completion of IO request
  sched: add wait_for_completion_io[_timeout]
  writeback: add more tracepoints
  block: add block_{touch|dirty}_buffer tracepoint
  buffer: make touch_buffer() an exported function
  block: add @req to bio_{front|back}_merge tracepoints
  block: add missing block_bio_complete() tracepoint
  block: Remove should_sort judgement when flush blk_plug
  block,elevator: use new hashtable implementation
  cfq-iosched: add hierarchical cfq_group statistics
  cfq-iosched: collect stats from dead cfqgs
  cfq-iosched: separate out cfqg_stats_reset() from cfq_pd_reset_stats()
  blkcg: make blkcg_print_blkgs() grab q locks instead of blkcg lock
  block: RCU free request_queue
  blkcg: implement blkg_[rw]stat_recursive_sum() and blkg_[rw]stat_merge()
  ...
</content>
</entry>
<entry>
<title>hlist: drop the node parameter from iterators</title>
<updated>2013-02-28T03:10:24Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2013-02-28T01:06:00Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b67bfe0d42cac56c512dd5da4b1b347a23f4b70a'/>
<id>urn:sha1:b67bfe0d42cac56c512dd5da4b1b347a23f4b70a</id>
<content type='text'>
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj-&gt;member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    &lt;+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+&gt;

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin &lt;peter.senna@gmail.com&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>block: don't request module during elevator init</title>
<updated>2013-01-23T00:48:03Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-01-23T00:48:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=21c3c5d2800733b7a276725b8e1ae49a694adc1a'/>
<id>urn:sha1:21c3c5d2800733b7a276725b8e1ae49a694adc1a</id>
<content type='text'>
Block layer allows selecting an elevator which is built as a module to
be selected as system default via kernel param "elevator=".  This is
achieved by automatically invoking request_module() whenever a new
block device is initialized and the elevator is not available.

This led to an interesting deadlock problem involving async and module
init.  Block device probing running off an async job invokes
request_module().  While the module is being loaded, it performs
async_synchronize_full() which ends up waiting for the async job which
is already waiting for request_module() to finish, leading to
deadlock.

Invoking request_module() from deep in block device init path is
already nasty in itself.  It seems best to avoid these situations from
the beginning by moving on-demand module loading out of block init
path.

The previous patch made sure that the default elevator module is
loaded early during boot if available.  This patch removes on-demand
loading of the default elevator from elevator init path.  As the
module would have been loaded during boot, userland-visible behavior
difference should be minimal.

For more details, please refer to the following thread.

  http://thread.gmane.org/gmane.linux.kernel/1420814

v2: The bool parameter was named @request_module which conflicted with
    request_module().  This built okay w/ CONFIG_MODULES because
    request_module() was defined as a macro.  W/o CONFIG_MODULES, it
    causes build breakage.  Rename the parameter to @try_loading.
    Reported by Fengguang.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alex Riesen &lt;raa.lkml@gmail.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
</content>
</entry>
<entry>
<title>init, block: try to load default elevator module early during boot</title>
<updated>2013-01-18T22:05:56Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-01-18T22:05:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bb813f4c933ae9f887a014483690d5f8b8ec05e1'/>
<id>urn:sha1:bb813f4c933ae9f887a014483690d5f8b8ec05e1</id>
<content type='text'>
This patch adds default module loading and uses it to load the default
block elevator.  During boot, it's called right after initramfs or
initrd is made available and right before control is passed to
userland.  This ensures that as long as the modules are available in
the usual places in initramfs, initrd or the root filesystem, the
default modules are loaded as soon as possible.

This will replace the on-demand elevator module loading from elevator
init path.

v2: Fixed build breakage when !CONFIG_BLOCK.  Reported by kbuild test
    robot.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alex Riesen &lt;raa.lkml@gmail.com&gt;
Cc: Fengguang We &lt;fengguang.wu@intel.com&gt;
</content>
</entry>
<entry>
<title>block,elevator: use new hashtable implementation</title>
<updated>2013-01-11T13:43:13Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2012-12-17T15:01:27Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=242d98f077ac0ab80920219769eb095503b93f61'/>
<id>urn:sha1:242d98f077ac0ab80920219769eb095503b93f61</id>
<content type='text'>
Switch elevator to use the new hashtable implementation. This reduces the
amount of generic unrelated code in the elevator.

This also removes the dymanic allocation of the hash table. The size of the table is
constant so there's no point in paying the price of an extra dereference when accessing
it.

This patch depends on d9b482c ("hashtable: introduce a small and naive
hashtable") which was merged in v3.6.

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
