<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/block, branch v3.16</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/block?h=v3.16</id>
<link rel='self' href='https://git.amat.us/linux/atom/block?h=v3.16'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2014-07-14T10:27:20Z</updated>
<entry>
<title>block: provide compat ioctl for BLKZEROOUT</title>
<updated>2014-07-14T10:27:20Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2014-07-02T16:46:23Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3b3a1814d1703027f9867d0f5cbbfaf6c7482474'/>
<id>urn:sha1:3b3a1814d1703027f9867d0f5cbbfaf6c7482474</id>
<content type='text'>
This patch provides the compat BLKZEROOUT ioctl. The argument is a pointer
to two uint64_t values, so there is no need to translate it.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: stable@vger.kernel.org	# 3.7+
Acked-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blkcg: don't call into policy draining if root_blkg is already gone</title>
<updated>2014-07-12T15:55:10Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-07-05T22:43:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0b462c89e31f7eb6789713437eb551833ee16ff3'/>
<id>urn:sha1:0b462c89e31f7eb6789713437eb551833ee16ff3</id>
<content type='text'>
While a queue is being destroyed, all the blkgs are destroyed and its
-&gt;root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [&lt;ffffffff8144e944&gt;] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[&lt;ffffffff8144e944&gt;]  [&lt;ffffffff8144e944&gt;] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [&lt;ffffffff8144ae2f&gt;] blkcg_drain_queue+0x1f/0x60
   [&lt;ffffffff81427641&gt;] __blk_drain_queue+0x71/0x180
   [&lt;ffffffff81429b3e&gt;] blk_queue_bypass_start+0x6e/0xb0
   [&lt;ffffffff814498b8&gt;] blkcg_deactivate_policy+0x38/0x120
   [&lt;ffffffff8144ec44&gt;] blk_throtl_exit+0x34/0x50
   [&lt;ffffffff8144aea5&gt;] blkcg_exit_queue+0x35/0x40
   [&lt;ffffffff8142d476&gt;] blk_release_queue+0x26/0xd0
   [&lt;ffffffff81454968&gt;] kobject_cleanup+0x38/0x70
   [&lt;ffffffff81454848&gt;] kobject_put+0x28/0x60
   [&lt;ffffffff81427505&gt;] blk_put_queue+0x15/0x20
   [&lt;ffffffff817d07bb&gt;] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [&lt;ffffffff810bc339&gt;] execute_in_process_context+0x89/0xa0
   [&lt;ffffffff817d064c&gt;] scsi_device_dev_release+0x1c/0x20
   [&lt;ffffffff817930e2&gt;] device_release+0x32/0xa0
   [&lt;ffffffff81454968&gt;] kobject_cleanup+0x38/0x70
   [&lt;ffffffff81454848&gt;] kobject_put+0x28/0x60
   [&lt;ffffffff817934d7&gt;] put_device+0x17/0x20
   [&lt;ffffffff817d11b9&gt;] __scsi_remove_device+0xa9/0xe0
   [&lt;ffffffff817d121b&gt;] scsi_remove_device+0x2b/0x40
   [&lt;ffffffff817d1257&gt;] sdev_store_delete+0x27/0x30
   [&lt;ffffffff81792ca8&gt;] dev_attr_store+0x18/0x30
   [&lt;ffffffff8126f75e&gt;] sysfs_kf_write+0x3e/0x50
   [&lt;ffffffff8126ea87&gt;] kernfs_fop_write+0xe7/0x170
   [&lt;ffffffff811f5e9f&gt;] vfs_write+0xaf/0x1d0
   [&lt;ffffffff811f69bd&gt;] SyS_write+0x4d/0xc0
   [&lt;ffffffff81d24692&gt;] system_call_fastpath+0x16/0x1b

776687bce42b ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Shirish Pargaonkar &lt;spargaonkar@suse.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Reported-by: Jet Chen &lt;jet.chen@intel.com&gt;
Cc: stable@vger.kernel.org
Tested-by: Shirish Pargaonkar &lt;spargaonkar@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: don't assume last put of shared tags is for the host</title>
<updated>2014-07-08T10:25:28Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2014-07-08T10:25:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d45b3279a5a2252cafcd665bbf2db8c9b31ef783'/>
<id>urn:sha1:d45b3279a5a2252cafcd665bbf2db8c9b31ef783</id>
<content type='text'>
There is no inherent reason why the last put of a tag structure must be
the one for the Scsi_Host, as device model objects can be held for
arbitrary periods.  Merge blk_free_tags and __blk_free_tags into a single
funtion that just release a references and get rid of the BUG() when the
host reference wasn't the last.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.dk/linux-block</title>
<updated>2014-06-26T20:06:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-06-26T20:06:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3493860c76eb4e5b222f1016d1458615afac9fc4'/>
<id>urn:sha1:3493860c76eb4e5b222f1016d1458615afac9fc4</id>
<content type='text'>
Pull block fixes from Jens Axboe:
 "A small collection of fixes/changes for the current series.  This
  contains:

   - Removal of dead code from Gu Zheng.

   - Revert of two bad fixes that went in earlier in this round, marking
     things as __init that were not purely used from init.

   - A fix for blk_mq_start_hw_queue() using the __blk_mq_run_hw_queue(),
     which could place us wrongly.  Make it use the non __ variant,
     which handles cases where we are called from the wrong CPU set.
     From me.

   - A fix for drbd, which allocates discard requests without room for
     the SCSI payload.  From Lars Ellenberg.

   - A fix for user-after-free in the blkcg code from Tejun.

   - Addition of limiting gaps in SG lists, if the hardware needs it.
     This is the last pre-req patch for blk-mq to enable the full NVMe
     conversion.  Could wait until 3.17, but it's simple enough so would
     be nice to have everything we need for the NVMe port in the 3.17
     release.  From me"

* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: fix NULL pointer deref in blk_add_request_payload
  blk-mq: blk_mq_start_hw_queue() should use blk_mq_run_hw_queue()
  block: add support for limiting gaps in SG lists
  bio: remove unused macro bip_vec_idx()
  Revert "block: add __init to elv_register"
  Revert "block: add __init to blkcg_policy_register"
  blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t
  floppy: format block0 read error message properly
</content>
</entry>
<entry>
<title>blk-mq: blk_mq_start_hw_queue() should use blk_mq_run_hw_queue()</title>
<updated>2014-06-25T14:22:34Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-25T14:22:34Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0ffbce80c263821161190f20e74a12f7aa8eab7b'/>
<id>urn:sha1:0ffbce80c263821161190f20e74a12f7aa8eab7b</id>
<content type='text'>
Currently it calls __blk_mq_run_hw_queue(), which depends on the
CPU placement being correct. This means it's not possible to call
blk_mq_start_hw_queues(q) from a context that is correct for all
queues, leading to triggering the

WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx-&gt;cpumask));

in __blk_mq_run_hw_queue().

Reported-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: add support for limiting gaps in SG lists</title>
<updated>2014-06-24T22:22:24Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-24T22:22:24Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=66cb45aa41315d1d9972cada354fbdf7870d7714'/>
<id>urn:sha1:66cb45aa41315d1d9972cada354fbdf7870d7714</id>
<content type='text'>
Another restriction inherited for NVMe - those devices don't support
SG lists that have "gaps" in them. Gaps refers to cases where the
previous SG entry doesn't end on a page boundary. For NVMe, all SG
entries must start at offset 0 (except the first) and end on a page
boundary (except the last).

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>Revert "block: add __init to elv_register"</title>
<updated>2014-06-22T22:34:11Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-22T22:32:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e567bf7112518824830978d644dfb5a991e67d54'/>
<id>urn:sha1:e567bf7112518824830978d644dfb5a991e67d54</id>
<content type='text'>
This reverts commit b5097e956a4d2919ee248d6481e4204c5568ed5c.

The original commit is buggy, we do use the registration functions
at runtime, for instance when loading IO schedulers through sysfs.

Reported-by: Damien Wyart &lt;damien.wyart@gmail.com&gt;
</content>
</entry>
<entry>
<title>Revert "block: add __init to blkcg_policy_register"</title>
<updated>2014-06-22T22:34:11Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-22T22:31:56Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d5bf02914ea3ecf28db4f830f136dc04146b2317'/>
<id>urn:sha1:d5bf02914ea3ecf28db4f830f136dc04146b2317</id>
<content type='text'>
This reverts commit a2d445d440003f2d70ee4cd4970ea82ace616fee.

The original commit is buggy, we do use the registration functions
at runtime for modular builds.
</content>
</entry>
<entry>
<title>blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t</title>
<updated>2014-06-22T22:30:52Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-06-19T21:42:57Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a5049a8ae34950249a7ae94c385d7c5c98914412'/>
<id>urn:sha1:a5049a8ae34950249a7ae94c385d7c5c98914412</id>
<content type='text'>
Hello,

So, this patch should do.  Joe, Vivek, can one of you guys please
verify that the oops goes away with this patch?

Jens, the original thread can be read at

  http://thread.gmane.org/gmane.linux.kernel/1720729

The fix converts blkg-&gt;refcnt from int to atomic_t.  It does some
overhead but it should be minute compared to everything else which is
going on and the involved cacheline bouncing, so I think it's highly
unlikely to cause any noticeable difference.  Also, the refcnt in
question should be converted to a perpcu_ref for blk-mq anyway, so the
atomic_t is likely to go away pretty soon anyway.

Thanks.

------- 8&lt; -------
__blkg_release_rcu() may be invoked after the associated request_queue
is released with a RCU grace period inbetween.  As such, the function
and callbacks invoked from it must not dereference the associated
request_queue.  This is clearly indicated in the comment above the
function.

Unfortunately, while trying to fix a different issue, 2a4fd070ee85
("blkcg: move bulk of blkcg_gq release operations to the RCU
callback") ignored this and added [un]locking of @blkg-&gt;q-&gt;queue_lock
to __blkg_release_rcu().  This of course can cause oops as the
request_queue may be long gone by the time this code gets executed.

  general protection fault: 0000 [#1] SMP
  CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1
  Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013
  task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000
  RIP: 0010:[&lt;ffffffff8162e9e5&gt;]  [&lt;ffffffff8162e9e5&gt;] _raw_spin_lock_irq+0x15/0x60
  RSP: 0018:ffff88085403fdf0  EFLAGS: 00010086
  RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000
  RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b
  RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39
  R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130
  R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0
  Stack:
   ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000
   ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0
   ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30
  Call Trace:
   [&lt;ffffffff812cbfc2&gt;] __blkg_release_rcu+0x72/0x150
   [&lt;ffffffff810d1d28&gt;] rcu_nocb_kthread+0x1e8/0x300
   [&lt;ffffffff81091d81&gt;] kthread+0xe1/0x100
   [&lt;ffffffff8163813c&gt;] ret_from_fork+0x7c/0xb0
  Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5
  +fa 66 66 90 66 66 90 b8 00 00 02 00 &lt;f0&gt; 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f
  +b7
  RIP  [&lt;ffffffff8162e9e5&gt;] _raw_spin_lock_irq+0x15/0x60
   RSP &lt;ffff88085403fdf0&gt;

The request_queue locking was added because blkcg_gq-&gt;refcnt is an int
protected with the queue lock and __blkg_release_rcu() needs to put
the parent.  Let's fix it by making blkcg_gq-&gt;refcnt an atomic_t and
dropping queue locking in the function.

Given the general heavy weight of the current request_queue and blkcg
operations, this is unlikely to cause any noticeable overhead.
Moreover, blkcg_gq-&gt;refcnt is likely to be converted to percpu_ref in
the near future, so whatever (most likely negligible) overhead it may
add is temporary.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Joe Lawrence &lt;joe.lawrence@stratus.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: http://lkml.kernel.org/g/alpine.DEB.2.02.1406081816540.17948@jlaw-desktop.mno.stratus.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.dk/linux-block</title>
<updated>2014-06-20T03:56:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-06-20T03:56:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f1d702487b3bc16466ad9b4e5c76277b6829d34c'/>
<id>urn:sha1:f1d702487b3bc16466ad9b4e5c76277b6829d34c</id>
<content type='text'>
Pull block fixes from Jens Axboe:
 "A smaller collection of fixes for the block core that would be nice to
  have in -rc2.  This pull request contains:

   - Fixes for races in the wait/wakeup logic used in blk-mq from
     Alexander.  No issues have been observed, but it is definitely a
     bit flakey currently.  Alternatively, we may drop the cyclic
     wakeups going forward, but that needs more testing.

   - Some cleanups from Christoph.

   - Fix for an oops in null_blk if queue_mode=1 and softirq completions
     are used.  From me.

   - A fix for a regression caused by the chunk size setting.  It
     inadvertently used max_hw_sectors instead of max_sectors, which is
     incorrect, and causes hangs on btrfs multi-disk setups (where hw
     sectors apparently isn't set).  From me.

   - Removal of WQ_POWER_EFFICIENT in the kblockd creation.  This was a
     recent addition as well, but it actually breaks blk-mq which relies
     on strict scheduling.  If the workqueue power_efficient mode is
     turned on, this breaks blk-mq.  From Matias.

   - null_blk module parameter description fix from Mike"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: bitmap tag: fix races in bt_get() function
  blk-mq: bitmap tag: fix race on blk_mq_bitmap_tags::wake_cnt
  blk-mq: bitmap tag: fix races on shared ::wake_index fields
  block: blk_max_size_offset() should check -&gt;max_sectors
  null_blk: fix softirq completions for queue_mode == 1
  blk-mq: merge blk_mq_drain_queue and __blk_mq_drain_queue
  blk-mq: properly drain stopped queues
  block: remove WQ_POWER_EFFICIENT from kblockd
  null_blk: fix name and description of 'queue_mode' module parameter
  block: remove elv_abort_queue and blk_abort_flushes
</content>
</entry>
</feed>
