<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/block_dev.c, branch v3.0.36</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/fs/block_dev.c?h=v3.0.36</id>
<link rel='self' href='https://git.amat.us/linux/atom/fs/block_dev.c?h=v3.0.36'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-06-01T07:12:52Z</updated>
<entry>
<title>block: don't mark buffers beyond end of disk as mapped</title>
<updated>2012-06-01T07:12:52Z</updated>
<author>
<name>Jeff Moyer</name>
<email>jmoyer@redhat.com</email>
</author>
<published>2012-05-11T14:34:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=37a8457773c266cdb77fcddec008cd73e81786be'/>
<id>urn:sha1:37a8457773c266cdb77fcddec008cd73e81786be</id>
<content type='text'>
commit 080399aaaf3531f5b8761ec0ac30ff98891e8686 upstream.

Hi,

We have a bug report open where a squashfs image mounted on ppc64 would
exhibit errors due to trying to read beyond the end of the disk.  It can
easily be reproduced by doing the following:

[root@ibm-p750e-02-lp3 ~]# ls -l install.img
-rw-r--r-- 1 root root 142032896 Apr 30 16:46 install.img
[root@ibm-p750e-02-lp3 ~]# mount -o loop ./install.img /mnt/test
[root@ibm-p750e-02-lp3 ~]# dd if=/dev/loop0 of=/dev/null
dd: reading `/dev/loop0': Input/output error
277376+0 records in
277376+0 records out
142016512 bytes (142 MB) copied, 0.9465 s, 150 MB/s

In dmesg, you'll find the following:

squashfs: version 4.0 (2009/01/31) Phillip Lougher
[   43.106012] attempt to access beyond end of device
[   43.106029] loop0: rw=0, want=277410, limit=277408
[   43.106039] Buffer I/O error on device loop0, logical block 138704
[   43.106053] attempt to access beyond end of device
[   43.106057] loop0: rw=0, want=277412, limit=277408
[   43.106061] Buffer I/O error on device loop0, logical block 138705
[   43.106066] attempt to access beyond end of device
[   43.106070] loop0: rw=0, want=277414, limit=277408
[   43.106073] Buffer I/O error on device loop0, logical block 138706
[   43.106078] attempt to access beyond end of device
[   43.106081] loop0: rw=0, want=277416, limit=277408
[   43.106085] Buffer I/O error on device loop0, logical block 138707
[   43.106089] attempt to access beyond end of device
[   43.106093] loop0: rw=0, want=277418, limit=277408
[   43.106096] Buffer I/O error on device loop0, logical block 138708
[   43.106101] attempt to access beyond end of device
[   43.106104] loop0: rw=0, want=277420, limit=277408
[   43.106108] Buffer I/O error on device loop0, logical block 138709
[   43.106112] attempt to access beyond end of device
[   43.106116] loop0: rw=0, want=277422, limit=277408
[   43.106120] Buffer I/O error on device loop0, logical block 138710
[   43.106124] attempt to access beyond end of device
[   43.106128] loop0: rw=0, want=277424, limit=277408
[   43.106131] Buffer I/O error on device loop0, logical block 138711
[   43.106135] attempt to access beyond end of device
[   43.106139] loop0: rw=0, want=277426, limit=277408
[   43.106143] Buffer I/O error on device loop0, logical block 138712
[   43.106147] attempt to access beyond end of device
[   43.106151] loop0: rw=0, want=277428, limit=277408
[   43.106154] Buffer I/O error on device loop0, logical block 138713
[   43.106158] attempt to access beyond end of device
[   43.106162] loop0: rw=0, want=277430, limit=277408
[   43.106166] attempt to access beyond end of device
[   43.106169] loop0: rw=0, want=277432, limit=277408
...
[   43.106307] attempt to access beyond end of device
[   43.106311] loop0: rw=0, want=277470, limit=2774

Squashfs manages to read in the end block(s) of the disk during the
mount operation.  Then, when dd reads the block device, it leads to
block_read_full_page being called with buffers that are beyond end of
disk, but are marked as mapped.  Thus, it would end up submitting read
I/O against them, resulting in the errors mentioned above.  I fixed the
problem by modifying init_page_buffers to only set the buffer mapped if
it fell inside of i_size.

Cheers,
Jeff

Signed-off-by: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Acked-by: Nick Piggin &lt;npiggin@kernel.dk&gt;

--

Changes from v1-&gt;v2: re-used max_block, as suggested by Nick Piggin.
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: Fix NULL pointer dereference in sd_revalidate_disk</title>
<updated>2012-03-19T15:57:58Z</updated>
<author>
<name>Jun'ichi Nomura</name>
<email>j-nomura@ce.jp.nec.com</email>
</author>
<published>2012-03-02T09:38:33Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=57babcb863bb689d49db626b4b69ce3629001fc1'/>
<id>urn:sha1:57babcb863bb689d49db626b4b69ce3629001fc1</id>
<content type='text'>
commit fe316bf2d5847bc5dd975668671a7b1067603bc7 upstream.

Since 2.6.39 (1196f8b), when a driver returns -ENOMEDIUM for open(),
__blkdev_get() calls rescan_partitions() to remove
in-kernel partition structures and raise KOBJ_CHANGE uevent.

However it ends up calling driver's revalidate_disk without open
and could cause oops.

In the case of SCSI:

  process A                  process B
  ----------------------------------------------
  sys_open
    __blkdev_get
      sd_open
        returns -ENOMEDIUM
                             scsi_remove_device
                               &lt;scsi_device torn down&gt;
      rescan_partitions
        sd_revalidate_disk
          &lt;oops&gt;
Oopses are reported here:
http://marc.info/?l=linux-scsi&amp;m=132388619710052

This patch separates the partition invalidation from rescan_partitions()
and use it for -ENOMEDIUM case.

Reported-by: Huajun Li &lt;huajun.li.lee@gmail.com&gt;
Signed-off-by: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: make gendisk hold a reference to its queue</title>
<updated>2011-11-11T17:37:07Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-10-17T11:42:43Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f53881e6463124c910b17a5ea722a5cf5ab67fb3'/>
<id>urn:sha1:f53881e6463124c910b17a5ea722a5cf5ab67fb3</id>
<content type='text'>
commit f992ae801a7dec34a4ed99a6598bbbbfb82af4fb upstream.

The following command sequence triggers an oops.

# mount /dev/sdb1 /mnt
# echo 1 &gt; /sys/class/scsi_device/0\:0\:1\:0/device/delete
# umount /mnt

 general protection fault: 0000 [#1] PREEMPT SMP
 CPU 2
 Modules linked in:

 Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
 RIP: 0010:[&lt;ffffffff810d0879&gt;]  [&lt;ffffffff810d0879&gt;] __lock_acquire+0x389/0x1d60
...
 Call Trace:
  [&lt;ffffffff810d2845&gt;] lock_acquire+0x95/0x140
  [&lt;ffffffff81aed87b&gt;] _raw_spin_lock+0x3b/0x50
  [&lt;ffffffff811573bc&gt;] bdi_lock_two+0x5c/0x70
  [&lt;ffffffff811c2f6c&gt;] bdev_inode_switch_bdi+0x4c/0xf0
  [&lt;ffffffff811c3fcb&gt;] __blkdev_put+0x11b/0x1d0
  [&lt;ffffffff811c4010&gt;] __blkdev_put+0x160/0x1d0
  [&lt;ffffffff811c40df&gt;] blkdev_put+0x5f/0x190
  [&lt;ffffffff8118f18d&gt;] kill_block_super+0x4d/0x80
  [&lt;ffffffff8118f4a5&gt;] deactivate_locked_super+0x45/0x70
  [&lt;ffffffff8119003a&gt;] deactivate_super+0x4a/0x70
  [&lt;ffffffff811ac4ad&gt;] mntput_no_expire+0xed/0x130
  [&lt;ffffffff811acf2e&gt;] sys_umount+0x7e/0x3a0
  [&lt;ffffffff81aeeeab&gt;] system_call_fastpath+0x16/0x1b

This is because bdev holds on to disk but disk doesn't pin the
associated queue.  If a SCSI device is removed while the device is
still open, the sdev puts the base reference to the queue on release.
When the bdev is finally released, the associated queue is already
gone along with the bdi and bdev_inode_switch_bdi() ends up
dereferencing already freed bdi.

Even if it were not for this bug, disk not holding onto the associated
queue is very unusual and error-prone.

Fix it by making add_disk() take an extra reference to its queue and
put it on disk_release() and ensuring that disk and its fops owner are
put in that order after all accesses to the disk and queue are
complete.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>Avoid dereferencing a 'request_queue' after last close.</title>
<updated>2011-10-03T18:40:14Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-09-10T07:20:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=97e5a85664252fd3f3f84751c7779c326df9c851'/>
<id>urn:sha1:97e5a85664252fd3f3f84751c7779c326df9c851</id>
<content type='text'>
commit 94007751bb02797ba87bac7aacee2731ac2039a3 upstream.

On the last close of an 'md' device which as been stopped, the device
is destroyed and in particular the request_queue is freed.  The free
is done in a separate thread so it might happen a short time later.

__blkdev_put calls bdev_inode_switch_bdi *after* -&gt;release has been
called.

Since commit f758eeabeb96f878c860e8f110f94ec8820822a9
bdev_inode_switch_bdi will dereference the 'old' bdi, which lives
inside a request_queue, to get a spin lock.  This causes the last
close on an md device to sometime take a spin_lock which lives in
freed memory - which results in an oops.

So move the called to bdev_inode_switch_bdi before the call to
-&gt;release.

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Acked-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>block: use the passed in @bdev when claiming if partno is zero</title>
<updated>2011-06-13T10:45:48Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-06-13T10:45:48Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d4c208b86b8be4254eba0e74071496e599f94639'/>
<id>urn:sha1:d4c208b86b8be4254eba0e74071496e599f94639</id>
<content type='text'>
6b4517a791 (block: implement bd_claiming and claiming block)
introduced claiming block to support O_EXCL blkdev opens properly.

bd_start_claiming() looks up the part 0 bdev and starts claiming
block.  The function assumed that there is only one part 0 bdev and
always used bdget_disk(disk, 0) to look it up; unfortunately, this
isn't true for some drivers (floppy) which use multiple block devices
to denote different operating parameters for the same physical device.
There can be multiple part 0 bdev's for the same device number.

This incorrect assumption caused the wrong bdev to be used during
claiming leading to unbalanced bd_holders as reported in the following
bug.

  https://bugzilla.kernel.org/show_bug.cgi?id=28522

This patch updates bd_start_claiming() such that it uses the bdev
specified as argument if its partno is zero.

Note that this means that different bdev's can be used for the same
device and O_EXCL check can be effectively bypassed.  It has always
been broken that way and floppy is fortunately on its way out.  Leave
that breakage alone.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Alex Villacis Lasso &lt;avillaci@ceibo.fiec.espol.edu.ec&gt;
Tested-by: Alex Villacis Lasso &lt;avillaci@ceibo.fiec.espol.edu.ec&gt;
Cc: stable@kernel.org	# &gt;= v2.6.36
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>block: blkdev_get() should access -&gt;bd_disk only after success</title>
<updated>2011-06-01T06:28:47Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-06-01T06:27:41Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4c49ff3fe128ca68dabd07537415c419ad7f82f9'/>
<id>urn:sha1:4c49ff3fe128ca68dabd07537415c419ad7f82f9</id>
<content type='text'>
d4dc210f69 (block: don't block events on excl write for non-optical
devices) added dereferencing of bdev-&gt;bd_disk to test
GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE; however, bdev-&gt;bd_disk can be
%NULL if open failed which can lead to an oops.

Test the flag after testing open was successful, not before.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: David Miller &lt;davem@davemloft.net&gt;
Tested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>block: move bd_set_size() above rescan_partitions() in __blkdev_get()</title>
<updated>2011-05-23T11:26:07Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-05-23T11:26:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7e69723fef8771a9d57bd27d36281d756130b4b5'/>
<id>urn:sha1:7e69723fef8771a9d57bd27d36281d756130b4b5</id>
<content type='text'>
02e352287a4 (block: rescan partitions on invalidated devices on
-ENOMEDIA too) relocated partition rescan above explicit bd_set_size()
to simplify condition check.  As rescan_partitions() does its own bdev
size setting, this doesn't break anything; however,
rescan_partitions() prints out the following messages when adjusting
bdev size, which can be confusing.

  sda: detected capacity change from 0 to 146815737856
  sdb: detected capacity change from 0 to 146815737856

This patch restores the original order and remove the warning
messages.

stable: Please apply together with 02e352287a4 (block: rescan
        partitions on invalidated devices on -ENOMEDIA too).

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Tony Luck &lt;tony.luck@gmail.com&gt;
Tested-by: Tony Luck &lt;tony.luck@gmail.com&gt;
Cc: stable@kernel.org

Stable note: 2.6.39 only.
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>block: don't block events on excl write for non-optical devices</title>
<updated>2011-04-21T18:54:46Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-04-21T18:54:46Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d4dc210f69bcb0b4bef5a83b1c323817be89bad1'/>
<id>urn:sha1:d4dc210f69bcb0b4bef5a83b1c323817be89bad1</id>
<content type='text'>
Disk event code automatically blocks events on excl write.  This is
primarily to avoid issuing polling commands while burning is in
progress.  This behavior doesn't fit other types of devices with
removeable media where polling commands don't have adverse side
effects and door locking usually doesn't exist.

This patch introduces new genhd flag which controls the auto-blocking
behavior and uses it to enable auto-blocking only on optical devices.

Note for stable: 2.6.38 and later only

Cc: stable@kernel.org
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>block: rescan partitions on invalidated devices on -ENOMEDIA too</title>
<updated>2011-04-21T18:54:45Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-04-21T18:54:45Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1196f8b814f32cd04df334abf47648c2a9fd8324'/>
<id>urn:sha1:1196f8b814f32cd04df334abf47648c2a9fd8324</id>
<content type='text'>
__blkdev_get() doesn't rescan partitions if disk-&gt;fops-&gt;open() fails,
which leads to ghost partition devices lingering after medimum removal
is known to both the kernel and userland.  The behavior also creates a
subtle inconsistency where O_NONBLOCK open, which doesn't fail even if
there's no medium, clears the ghots partitions, which is exploited to
work around the problem from userland.

Fix it by updating __blkdev_get() to issue partition rescan after
-ENOMEDIA too.

This was reported in the following bz.

 https://bugzilla.kernel.org/show_bug.cgi?id=13029

Note for stable: 2.6.38 and later only

Cc: stable@kernel.org
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: David Zeuthen &lt;zeuthen@gmail.com&gt;
Reported-by: Martin Pitt &lt;martin.pitt@ubuntu.com&gt;
Reported-by: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Tested-by: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
</entry>
<entry>
<title>Fix common misspellings</title>
<updated>2011-03-31T14:26:23Z</updated>
<author>
<name>Lucas De Marchi</name>
<email>lucas.demarchi@profusion.mobi</email>
</author>
<published>2011-03-31T01:57:33Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=25985edcedea6396277003854657b5f3cb31a628'/>
<id>urn:sha1:25985edcedea6396277003854657b5f3cb31a628</id>
<content type='text'>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</content>
</entry>
</feed>
