<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/block, branch v3.4.96</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/drivers/block?h=v3.4.96</id>
<link rel='self' href='https://git.amat.us/linux/atom/drivers/block?h=v3.4.96'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2014-06-11T19:04:18Z</updated>
<entry>
<title>virtio-blk: Don't free ida when disk is in use</title>
<updated>2014-06-11T19:04:18Z</updated>
<author>
<name>Alexander Graf</name>
<email>agraf@suse.de</email>
</author>
<published>2013-01-02T05:07:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7f3874ea67992faee3a7ff4dca382d0cb0ee0c06'/>
<id>urn:sha1:7f3874ea67992faee3a7ff4dca382d0cb0ee0c06</id>
<content type='text'>
commit f4953fe6c4aeada2d5cafd78aa97587a46d2d8f9 upstream.

When a file system is mounted on a virtio-blk disk, we then remove it
and then reattach it, the reattached disk gets the same disk name and
ids as the hot removed one.

This leads to very nasty effects - mostly rendering the newly attached
device completely unusable.

Trying what happens when I do the same thing with a USB device, I saw
that the sd node simply doesn't get free'd when a device gets forcefully
removed.

Imitate the same behavior for vd devices. This way broken vd devices
simply are never free'd and newly attached ones keep working just fine.

Signed-off-by: Alexander Graf &lt;agraf@suse.de&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>virtio-blk: Reset device after blk_cleanup_queue()</title>
<updated>2014-06-11T19:04:17Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-05-25T02:34:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=984ca88a55ccf51d4eeef67ce2247c58694a2dc5'/>
<id>urn:sha1:984ca88a55ccf51d4eeef67ce2247c58694a2dc5</id>
<content type='text'>
commit 483001c765af6892b3fc3726576cb42f17d1d6b5 upstream.

blk_cleanup_queue() will call blk_drian_queue() to drain all the
requests before queue DEAD marking. If we reset the device before
blk_cleanup_queue() the drain would fail.

1) if the queue is stopped in do_virtblk_request() because device is
full, the q-&gt;request_fn() will not be called.

blk_drain_queue() {
   while(true) {
      ...
      if (!list_empty(&amp;q-&gt;queue_head))
        __blk_run_queue(q) {
	    if (queue is not stoped)
		q-&gt;request_fn()
	}
      ...
   }
}

Do no reset the device before blk_cleanup_queue() gives the chance to
start the queue in interrupt handler blk_done().

2) In commit b79d866c8b7014a51f611a64c40546109beaf24a, We abort requests
dispatched to driver before blk_cleanup_queue(). There is a race if
requests are dispatched to driver after the abort and before the queue
DEAD mark. To fix this, instead of aborting the requests explicitly, we
can just reset the device after after blk_cleanup_queue so that the
device can complete all the requests before queue DEAD marking in the
drain process.

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>virtio-blk: Call del_gendisk() before disable guest kick</title>
<updated>2014-06-11T19:04:17Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-05-25T02:34:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=af6e24de4b787c77070bbaa2d0cd8e479f1db2d7'/>
<id>urn:sha1:af6e24de4b787c77070bbaa2d0cd8e479f1db2d7</id>
<content type='text'>
commit 02e2b124943648fba0a2ccee5c3656a5653e0151 upstream.

del_gendisk() might not return due to failing to remove the
/sys/block/vda/serial sysfs entry when another thread (udev) is
trying to read it.

virtblk_remove()
  vdev-&gt;config-&gt;reset() : guest will not kick us through interrupt
    del_gendisk()
      device_del()
        kobject_del(): got stuck, sysfs entry ref count non zero

sysfs_open_file(): user space process read /sys/block/vda/serial
   sysfs_get_active() : got sysfs entry ref count
      dev_attr_show()
        virtblk_serial_show()
           blk_execute_rq() : got stuck, interrupt is disabled
                              request cannot be finished

This patch fixes it by calling del_gendisk() before we disable guest's
interrupt so that the request sent in virtblk_serial_show() will be
finished and del_gendisk() will success.

This fixes another race in hot-unplug process.

It is save to call del_gendisk(vblk-&gt;disk) before
flush_work(&amp;vblk-&gt;config_work) which might access vblk-&gt;disk, because
vblk-&gt;disk is not freed until put_disk(vblk-&gt;disk).

Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>virtio-blk: Fix hot-unplug race in remove method</title>
<updated>2014-06-11T19:04:17Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-05-04T12:22:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f515bcabb5e4514126ab1635c215256cce4163a2'/>
<id>urn:sha1:f515bcabb5e4514126ab1635c215256cce4163a2</id>
<content type='text'>
commit b79d866c8b7014a51f611a64c40546109beaf24a upstream.

If we reset the virtio-blk device before the requests already dispatched
to the virtio-blk driver from the block layer are finised, we will stuck
in blk_cleanup_queue() and the remove will fail.

blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
before DEAD marking. However it will never success if the device is
already stopped. We'll have q-&gt;in_flight[] &gt; 0, so the drain will not
finish.

How to reproduce the race:
1. hot-plug a virtio-blk device
2. keep reading/writing the device in guest
3. hot-unplug while the device is busy serving I/O

Test:
~1000 rounds of hot-plug/hot-unplug test passed with this patch.

Changes in v3:
- Drop blk_abort_queue and blk_abort_request
- Use __blk_end_request_all to complete request dispatched to driver

Changes in v2:
- Drop req_in_flight
- Use virtqueue_detach_unused_buf to get request dispatched to driver

Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>virtio_blk: Drop unused request tracking list</title>
<updated>2014-06-11T19:04:17Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-03-30T03:24:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=36bd1996971bb124f4629994376c5fbb99a0e855'/>
<id>urn:sha1:36bd1996971bb124f4629994376c5fbb99a0e855</id>
<content type='text'>
commit f65ca1dc6a8c81c6bd72297d4399ec5f4c1f3a01 upstream.

Benchmark shows small performance improvement on fusion io device.

Before:
  seq-read : io=1,024MB, bw=19,982KB/s, iops=39,964, runt= 52475msec
  seq-write: io=1,024MB, bw=20,321KB/s, iops=40,641, runt= 51601msec
  rnd-read : io=1,024MB, bw=15,404KB/s, iops=30,808, runt= 68070msec
  rnd-write: io=1,024MB, bw=14,776KB/s, iops=29,552, runt= 70963msec

After:
  seq-read : io=1,024MB, bw=20,343KB/s, iops=40,685, runt= 51546msec
  seq-write: io=1,024MB, bw=20,803KB/s, iops=41,606, runt= 50404msec
  rnd-read : io=1,024MB, bw=16,221KB/s, iops=32,442, runt= 64642msec
  rnd-write: io=1,024MB, bw=15,199KB/s, iops=30,397, runt= 68991msec

Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>nbd: fsync and kill block device on shutdown</title>
<updated>2014-06-07T23:02:10Z</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2013-02-28T01:05:25Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f40ed19c957e23cee2a178a688c45197aa1bc30c'/>
<id>urn:sha1:f40ed19c957e23cee2a178a688c45197aa1bc30c</id>
<content type='text'>
commit 3a2d63f87989e01437ba994df5f297528c353d7d upstream.

There are two problems with shutdown in the NBD driver.

1: Receiving the NBD_DISCONNECT ioctl does not sync the filesystem.

   This patch adds the sync operation into __nbd_ioctl()'s
   NBD_DISCONNECT handler.  This is useful because BLKFLSBUF is restricted
   to processes that have CAP_SYS_ADMIN, and the NBD client may not
   possess it (fsync of the block device does not sync the filesystem,
   either).

2: Once we clear the socket we have no guarantee that later reads will
   come from the same backing storage.

   The patch adds calls to kill_bdev() in __nbd_ioctl()'s socket
   clearing code so the page cache is cleaned, lest reads that hit on the
   page cache will return stale data from the previously-accessible disk.

Example:

    # qemu-nbd -r -c/dev/nbd0 /dev/sr0
    # file -s /dev/nbd0
    /dev/stdin: # UDF filesystem data (version 1.5) etc.
    # qemu-nbd -d /dev/nbd0
    # qemu-nbd -r -c/dev/nbd0 /dev/sda
    # file -s /dev/nbd0
    /dev/stdin: # UDF filesystem data (version 1.5) etc.

While /dev/sda has:

    # file -s /dev/sda
    /dev/sda: x86 boot sector; etc.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Acked-by: Paul Clements &lt;Paul.Clements@steeleye.com&gt;
Cc: Alex Bligh &lt;alex@alex.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2:
 - Adjusted context
 - s/\bnbd\b/lo/
 - Incorporate export of kill_bdev() from commit ff01bb483265
   ('fs: move code out of buffer.c')]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
[hq: Backported to 3.4: Adjusted context]
Signed-off-by: Qiang Huang &lt;h.huangqiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>floppy: properly handle failure on add_disk loop</title>
<updated>2014-06-07T23:02:06Z</updated>
<author>
<name>Herton Ronaldo Krzesinski</name>
<email>herton.krzesinski@canonical.com</email>
</author>
<published>2012-08-27T23:56:54Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=85923c3d426b2059121171c042ccd82fc1beab21'/>
<id>urn:sha1:85923c3d426b2059121171c042ccd82fc1beab21</id>
<content type='text'>
commit d60e7ec18c3fb2cbf90969ccd42889eb2d03aef9 upstream.

On floppy initialization, if something failed inside the loop we call
add_disk, there was no cleanup of previous iterations in the error
handling.

Signed-off-by: Herton Ronaldo Krzesinski &lt;herton.krzesinski@canonical.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Qiang Huang &lt;h.huangqiang@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>floppy: don't write kernel-only members to FDRAWCMD ioctl output</title>
<updated>2014-05-13T12:11:29Z</updated>
<author>
<name>Matthew Daley</name>
<email>mattd@bugfuzz.com</email>
</author>
<published>2014-04-28T07:05:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=bfa779779247e9421b23155747e1779ce87aa040'/>
<id>urn:sha1:bfa779779247e9421b23155747e1779ce87aa040</id>
<content type='text'>
commit 2145e15e0557a01b9195d1c7199a1b92cb9be81f upstream.

Do not leak kernel-only floppy_raw_cmd structure members to userspace.
This includes the linked-list pointer and the pointer to the allocated
DMA space.

Signed-off-by: Matthew Daley &lt;mattd@bugfuzz.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>floppy: ignore kernel-only members in FDRAWCMD ioctl input</title>
<updated>2014-05-13T12:11:29Z</updated>
<author>
<name>Matthew Daley</name>
<email>mattd@bugfuzz.com</email>
</author>
<published>2014-04-28T07:05:20Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=a04d8ef979b661ce9051d0f22b0b2c25fc94b955'/>
<id>urn:sha1:a04d8ef979b661ce9051d0f22b0b2c25fc94b955</id>
<content type='text'>
commit ef87dbe7614341c2e7bfe8d32fcb7028cc97442c upstream.

Always clear out these floppy_raw_cmd struct members after copying the
entire structure from userspace so that the in-kernel version is always
valid and never left in an interdeterminate state.

Signed-off-by: Matthew Daley &lt;mattd@bugfuzz.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>xen/blkback: Check for insane amounts of request on the ring (v6).</title>
<updated>2014-03-11T23:10:07Z</updated>
<author>
<name>Konrad Rzeszutek Wilk</name>
<email>konrad.wilk@oracle.com</email>
</author>
<published>2013-01-23T21:54:32Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=62047439828b2f7c984b7471d00dc11a07a1452b'/>
<id>urn:sha1:62047439828b2f7c984b7471d00dc11a07a1452b</id>
<content type='text'>
commit 9371cadbbcc7c00c81753b9727b19fb3bc74d458 upstream.

commit 8e3f8755545cc4a7f4da8e9ef76d6d32e0dca576 upstream.

Check that the ring does not have an insane amount of requests
(more than there could fit on the ring).

If we detect this case we will stop processing the requests
and wait until the XenBus disconnects the ring.

The existing check RING_REQUEST_CONS_OVERFLOW which checks for how
many responses we have created in the past (rsp_prod_pvt) vs
requests consumed (req_cons) and whether said difference is greater or
equal to the size of the ring, does not catch this case.

Wha the condition does check if there is a need to process more
as we still have a backlog of responses to finish. Note that both
of those values (rsp_prod_pvt and req_cons) are not exposed on the
shared ring.

To understand this problem a mini crash course in ring protocol
response/request updates is in place.

There are four entries: req_prod and rsp_prod; req_event and rsp_event
to track the ring entries. We are only concerned about the first two -
which set the tone of this bug.

The req_prod is a value incremented by frontend for each request put
on the ring. Conversely the rsp_prod is a value incremented by the backend
for each response put on the ring (rsp_prod gets set by rsp_prod_pvt when
pushing the responses on the ring).  Both values can
wrap and are modulo the size of the ring (in block case that is 32).
Please see RING_GET_REQUEST and RING_GET_RESPONSE for the more details.

The culprit here is that if the difference between the
req_prod and req_cons is greater than the ring size we have a problem.
Fortunately for us, the '__do_block_io_op' loop:

	rc = blk_rings-&gt;common.req_cons;
	rp = blk_rings-&gt;common.sring-&gt;req_prod;

	while (rc != rp) {

		..
		blk_rings-&gt;common.req_cons = ++rc; /* before make_response() */

	}

will loop up to the point when rc == rp. The macros inside of the
loop (RING_GET_REQUEST) is smart and is indexing based on the modulo
of the ring size. If the frontend has provided a bogus req_prod value
we will loop until the 'rc == rp' - which means we could be processing
already processed requests (or responses) often.

The reason the RING_REQUEST_CONS_OVERFLOW is not helping here is
b/c it only tracks how many responses we have internally produced
and whether we would should process more. The astute reader will
notice that the macro RING_REQUEST_CONS_OVERFLOW provides two
arguments - more on this later.

For example, if we were to enter this function with these values:

       	blk_rings-&gt;common.sring-&gt;req_prod =  X+31415 (X is the value from
		the last time __do_block_io_op was called).
        blk_rings-&gt;common.req_cons = X
        blk_rings-&gt;common.rsp_prod_pvt = X

The RING_REQUEST_CONS_OVERFLOW(&amp;blk_rings-&gt;common, blk_rings-&gt;common.req_cons)
is doing:

	req_cons - rsp_prod_pvt &gt;= 32

Which is,
	X - X &gt;= 32 or 0 &gt;= 32

And that is false, so we continue on looping (this bug).

If we re-use said macro RING_REQUEST_CONS_OVERFLOW and pass in the rp
instead (sring-&gt;req_prod) of rc, the this macro can do the check:

     req_prod - rsp_prov_pvt &gt;= 32

Which is,
       X + 31415 - X &gt;= 32 , or 31415 &gt;= 32

which is true, so we can error out and break out of the function.

Unfortunatly the difference between rsp_prov_pvt and req_prod can be
at 32 (which would error out in the macro). This condition exists when
the backend is lagging behind with the responses and still has not finished
responding to all of them (so make_response has not been called), and
the rsp_prov_pvt + 32 == req_cons. This ends up with us not being able
to use said macro.

Hence introducing a new macro called RING_REQUEST_PROD_OVERFLOW which does
a simple check of:

    req_prod - rsp_prod_pvt &gt; RING_SIZE

And with the X values from above:

   X + 31415 - X &gt; 32

Returns true. Also not that if the ring is full (which is where
the RING_REQUEST_CONS_OVERFLOW triggered), we would not hit the
same condition:

   X + 32 - X &gt; 32

Which is false.

Lets use that macro.
Note that in v5 of this patchset the macro was different - we used an
earlier version.

[v1: Move the check outside the loop]
[v2: Add a pr_warn as suggested by David]
[v3: Use RING_REQUEST_CONS_OVERFLOW as suggested by Jan]
[v4: Move wake_up after kthread_stop as suggested by Jan]
[v5: Use RING_REQUEST_PROD_OVERFLOW instead]
[v6: Use RING_REQUEST_PROD_OVERFLOW - Jan's version]
Signed-off-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Reviewed-by: Jan Beulich &lt;jbeulich@suse.com&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Yijing Wang &lt;wangyijing@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
