<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/md, branch v3.10.20</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/drivers/md?h=v3.10.20</id>
<link rel='self' href='https://git.amat.us/linux/atom/drivers/md?h=v3.10.20'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-11-13T03:05:32Z</updated>
<entry>
<title>md: Fix skipping recovery for read-only arrays.</title>
<updated>2013-11-13T03:05:32Z</updated>
<author>
<name>Lukasz Dorau</name>
<email>lukasz.dorau@intel.com</email>
</author>
<published>2013-10-24T01:55:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=ed840bec21c6f2f99ca34e974a5905e4f2116c1b'/>
<id>urn:sha1:ed840bec21c6f2f99ca34e974a5905e4f2116c1b</id>
<content type='text'>
commit 61e4947c99c4494336254ec540c50186d186150b upstream.

Since:
        commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86
        md: Allow devices to be re-added to a read-only array.

spares are activated on a read-only array. For the raid1 and raid10
personalities, this causes not-in-sync devices to be marked in-sync
without checking whether recovery has finished.

If a read-only array is degraded and one of its devices is not in-sync
(because the array has been only partially recovered) recovery will be skipped.

This patch adds a check that recovery has finished before marking a device
in-sync for the raid1 and raid10 personalities. The raid5 personality
already has such a condition (at raid5.c:6029).
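
A minimal sketch of the added check, as a Python model rather than the kernel
code. The Device class, its field names and MAX_SECTOR are illustrative
assumptions standing in for the per-device state and the "recovery complete"
sentinel:

```python
from dataclasses import dataclass

# Assumed sentinel meaning "recovery has reached the end of the device".
MAX_SECTOR = 2**64 - 1


@dataclass
class Device:
    recovery_offset: int   # how far recovery has progressed on this device
    faulty: bool = False
    in_sync: bool = False


def activate_spares(devices):
    """Mark devices in-sync only when their recovery has completed."""
    activated = 0
    for dev in devices:
        if dev.faulty or dev.in_sync:
            continue
        # The check described above: a partially recovered device
        # must stay not-in-sync so that recovery is not skipped.
        if dev.recovery_offset == MAX_SECTOR:
            dev.in_sync = True
            activated += 1
    return activated
```

With one fully recovered device and one partially recovered device, only the
first is marked in-sync.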

The bug was introduced in 3.10 and causes data corruption.

Signed-off-by: Pawel Baldysiak &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: Lukasz Dorau &lt;lukasz.dorau@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: avoid deadlock when md_set_badblocks.</title>
<updated>2013-11-13T03:05:32Z</updated>
<author>
<name>Bian Yu</name>
<email>bianyu@kedacom.com</email>
</author>
<published>2013-10-12T05:10:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0465496671f4769e0f4f00481ce5bc5598c5caa2'/>
<id>urn:sha1:0465496671f4769e0f4f00481ce5bc5598c5caa2</id>
<content type='text'>
commit 905b0297a9533d7a6ee00a01a990456636877dd6 upstream.

When operating on a hard disk and hitting errors, md_set_badblocks is called
after scsi_restart_operations, which has already disabled irqs. But
md_set_badblocks calls write_sequnlock_irq, which enables irqs, so a softirq
can preempt the current thread and that may cause a deadlock. I think this
situation should use write_seqlock_irqsave/write_sequnlock_irqrestore instead.
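
The difference between the two locking styles can be sketched with a toy
Python model of a CPU's interrupt-enable flag. The Cpu class and both helper
functions are hypothetical and only mimic the flag handling, not the lock
itself:

```python
class Cpu:
    """Toy model of one CPU's local interrupt-enable flag."""

    def __init__(self, irqs_enabled):
        self.irqs_enabled = irqs_enabled


def update_with_irq(cpu):
    # Models the lock_irq / unlock_irq pairing.
    cpu.irqs_enabled = False
    # ... critical section ...
    cpu.irqs_enabled = True    # unconditional enable on unlock


def update_with_irqsave(cpu):
    # Models the lock_irqsave / unlock_irqrestore pairing.
    flags = cpu.irqs_enabled   # remember the caller's state
    cpu.irqs_enabled = False
    # ... critical section ...
    cpu.irqs_enabled = flags   # restore exactly what the caller had
```

If the caller enters with interrupts already disabled, update_with_irq leaves
them enabled on return, while update_with_irqsave leaves them disabled.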

I hit this situation; the call trace is below:
[  638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010
[  638.921923]  lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0
[  638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37
[  638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013
[  638.927816]  ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007
[  638.929829]  ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8
[  638.931848]  ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8
[  638.933884] Call Trace:
[  638.935867]  &lt;IRQ&gt;  [&lt;ffffffff8172ff85&gt;] dump_stack+0x55/0x76
[  638.937878]  [&lt;ffffffff81730030&gt;] spin_dump+0x8a/0x8f
[  638.939861]  [&lt;ffffffff81730056&gt;] spin_bug+0x21/0x26
[  638.941836]  [&lt;ffffffff81336de4&gt;] do_raw_spin_lock+0xa4/0xc0
[  638.943801]  [&lt;ffffffff8173f036&gt;] _raw_spin_lock+0x66/0x80
[  638.945747]  [&lt;ffffffff814a73ed&gt;] ? scsi_device_unbusy+0x9d/0xd0
[  638.947672]  [&lt;ffffffff8173fb1b&gt;] ? _raw_spin_unlock+0x2b/0x50
[  638.949595]  [&lt;ffffffff814a73ed&gt;] scsi_device_unbusy+0x9d/0xd0
[  638.951504]  [&lt;ffffffff8149ec47&gt;] scsi_finish_command+0x37/0xe0
[  638.953388]  [&lt;ffffffff814a75e8&gt;] scsi_softirq_done+0xa8/0x140
[  638.955248]  [&lt;ffffffff8130e32b&gt;] blk_done_softirq+0x7b/0x90
[  638.957116]  [&lt;ffffffff8104fddd&gt;] __do_softirq+0xfd/0x330
[  638.958987]  [&lt;ffffffff810b964f&gt;] ? __lock_release+0x6f/0x100
[  638.960861]  [&lt;ffffffff8174a5cc&gt;] call_softirq+0x1c/0x30
[  638.962724]  [&lt;ffffffff81004c7d&gt;] do_softirq+0x8d/0xc0
[  638.964565]  [&lt;ffffffff8105024e&gt;] irq_exit+0x10e/0x150
[  638.966390]  [&lt;ffffffff8174ad4a&gt;] smp_apic_timer_interrupt+0x4a/0x60
[  638.968223]  [&lt;ffffffff817499af&gt;] apic_timer_interrupt+0x6f/0x80
[  638.970079]  &lt;EOI&gt;  [&lt;ffffffff810b964f&gt;] ? __lock_release+0x6f/0x100
[  638.971899]  [&lt;ffffffff8173fa6a&gt;] ? _raw_spin_unlock_irq+0x3a/0x50
[  638.973691]  [&lt;ffffffff8173fa60&gt;] ? _raw_spin_unlock_irq+0x30/0x50
[  638.975475]  [&lt;ffffffff81562393&gt;] md_set_badblocks+0x1f3/0x4a0
[  638.977243]  [&lt;ffffffff81566e07&gt;] rdev_set_badblocks+0x27/0x80
[  638.978988]  [&lt;ffffffffa00d97bb&gt;] raid5_end_read_request+0x36b/0x4e0 [raid456]
[  638.980723]  [&lt;ffffffff811b5a1d&gt;] bio_endio+0x1d/0x40
[  638.982463]  [&lt;ffffffff81304ff3&gt;] req_bio_endio.isra.65+0x83/0xa0
[  638.984214]  [&lt;ffffffff81306b9f&gt;] blk_update_request+0x7f/0x350
[  638.985967]  [&lt;ffffffff81306ea1&gt;] blk_update_bidi_request+0x31/0x90
[  638.987710]  [&lt;ffffffff813085e0&gt;] __blk_end_bidi_request+0x20/0x50
[  638.989439]  [&lt;ffffffff8130862f&gt;] __blk_end_request_all+0x1f/0x30
[  638.991149]  [&lt;ffffffff81308746&gt;] blk_peek_request+0x106/0x250
[  638.992861]  [&lt;ffffffff814a62a9&gt;] ? scsi_kill_request.isra.32+0xe9/0x130
[  638.994561]  [&lt;ffffffff814a633a&gt;] scsi_request_fn+0x4a/0x3d0
[  638.996251]  [&lt;ffffffff813040a7&gt;] __blk_run_queue+0x37/0x50
[  638.997900]  [&lt;ffffffff813045af&gt;] blk_run_queue+0x2f/0x50
[  638.999553]  [&lt;ffffffff814a5750&gt;] scsi_run_queue+0xe0/0x1c0
[  639.001185]  [&lt;ffffffff814a7721&gt;] scsi_run_host_queues+0x21/0x40
[  639.002798]  [&lt;ffffffff814a2e87&gt;] scsi_restart_operations+0x177/0x200
[  639.004391]  [&lt;ffffffff814a4fe9&gt;] scsi_error_handler+0xc9/0xe0
[  639.005996]  [&lt;ffffffff814a4f20&gt;] ? scsi_unjam_host+0xd0/0xd0
[  639.007600]  [&lt;ffffffff81072f6b&gt;] kthread+0xdb/0xe0
[  639.009205]  [&lt;ffffffff81072e90&gt;] ? flush_kthread_worker+0x170/0x170
[  639.010821]  [&lt;ffffffff81748cac&gt;] ret_from_fork+0x7c/0xb0
[  639.012437]  [&lt;ffffffff81072e90&gt;] ? flush_kthread_worker+0x170/0x170

This bug was introduced in commit  2e8ac30312973dd20e68073653
(the first time rdev_set_badblocks was called from interrupt context),
so this patch is appropriate for 3.5 and subsequent kernels.

Signed-off-by: Bian Yu &lt;bianyu@kedacom.com&gt;
Reviewed-by: Jianpeng Ma &lt;majianpeng@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>raid5: avoid finding "discard" stripe</title>
<updated>2013-11-13T03:05:31Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-10-19T06:51:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=01e608d7276508fcafb76f2092db89885e62ef66'/>
<id>urn:sha1:01e608d7276508fcafb76f2092db89885e62ef66</id>
<content type='text'>
commit d47648fcf0611812286f68131b40251c6fa54f5e upstream.

A SCSI discard will damage the discard stripe's bio settings, e.g. some fields
are changed. If the stripe is reused very soon, we have the wrong bio settings.
We remove the discard stripe from the hash list, so that next time the stripe
will be fully initialized.

Suitable for backport to 3.7+.

Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>raid5: set bio bi_vcnt 0 for discard request</title>
<updated>2013-11-13T03:05:31Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2013-10-19T06:50:28Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=7e44a92662ce582268c4f35e68aad1f632ada8f8'/>
<id>urn:sha1:7e44a92662ce582268c4f35e68aad1f632ada8f8</id>
<content type='text'>
commit 37c61ff31e9b5e3fcf3cc6579f5c68f6ad40c4b1 upstream.

The SCSI layer will add a new payload for a discard request. If two bios are
merged into one, the second bio has bi_vcnt 1, which is set in raid5. This will
confuse SCSI and cause an oops.

Suitable for backport to 3.7+

Reported-by: Jes Sorensen &lt;Jes.Sorensen@redhat.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Acked-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: Fixed incorrect order of arguments to bio_alloc_bioset()</title>
<updated>2013-11-13T03:05:30Z</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-10-22T22:35:50Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=955a23e181561a792d6e4c1572848b7ba306499f'/>
<id>urn:sha1:955a23e181561a792d6e4c1572848b7ba306499f</id>
<content type='text'>
commit d4eddd42f592a0cf06818fae694a3d271f842e4d upstream.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm snapshot: fix data corruption</title>
<updated>2013-11-04T12:31:06Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-10-16T02:17:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2d99b6dd66b5778d92fc411b48037084528e1ae2'/>
<id>urn:sha1:2d99b6dd66b5778d92fc411b48037084528e1ae2</id>
<content type='text'>
commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream.

This patch fixes a particular type of data corruption that has been
encountered when loading a snapshot's metadata from disk.

When we allocate a new chunk in persistent_prepare, we increment
ps-&gt;next_free and we make sure that it doesn't point to a metadata area
by further incrementing it if necessary.

When we load metadata from disk on device activation, ps-&gt;next_free is
positioned after the last used data chunk. However, if this last used
data chunk is followed by a metadata area, ps-&gt;next_free is positioned
erroneously to the metadata area. A newly-allocated chunk is placed at
the same location as the metadata area, resulting in data or metadata
corruption.

This patch changes the code so that ps-&gt;next_free skips the metadata
area when metadata are loaded in function read_exceptions.

The patch also moves a piece of code from persistent_prepare_exception
to a separate function skip_metadata to avoid code duplication.
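
A rough Python sketch of the skipping logic, under an assumed on-disk layout:
chunk 0 is the snapshot header, and each following group of chunks starts with
one metadata chunk followed by exceptions_per_area data chunks. The constant
name and the function signature are illustrative, not the kernel code:

```python
# Assumed layout: chunk 0 holds the header, so the first metadata chunk
# is chunk 1 and later ones repeat every (exceptions_per_area + 1) chunks.
NUM_SNAPSHOT_HDR_CHUNKS = 1


def skip_metadata(next_free, exceptions_per_area):
    """Return next_free, advanced by one if it points at a metadata chunk."""
    stride = exceptions_per_area + 1
    if next_free % stride == NUM_SNAPSHOT_HDR_CHUNKS:
        return next_free + 1
    return next_free
```

Calling the same helper from both the allocation path and the metadata
loading path keeps next_free off the metadata area in either case.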

CVE-2013-4299

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: Fix a null ptr deref regression</title>
<updated>2013-10-13T23:08:35Z</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-10-11T00:31:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=84c8b3b9e17107a74f07dc5e48264034f1410b97'/>
<id>urn:sha1:84c8b3b9e17107a74f07dc5e48264034f1410b97</id>
<content type='text'>
commit 2fe80d3bbf1c8bd9efc5b8154207c8dd104e7306 upstream.

Commit c0f04d88e46d ("bcache: Fix flushes in writeback mode") was fixing
a reported data corruption bug, but it seems some last minute
refactoring or rebasing introduced a null pointer deref.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Reported-by: Gabriel de Perthuis &lt;g2p.code@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm-raid: silence compiler warning on rebuilds_per_group.</title>
<updated>2013-10-05T14:13:11Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-05-09T00:27:49Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f38af5d3f6aa1186b0ac9a1ef021d425550b479b'/>
<id>urn:sha1:f38af5d3f6aa1186b0ac9a1ef021d425550b479b</id>
<content type='text'>
commit 3f6bbd3ffd7b733dd705e494663e5761aa2cb9c1 upstream.

This doesn't really need to be initialised, but it doesn't hurt,
silences the compiler, and as it is a counter it makes sense for it to
start at zero.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm mpath: disable WRITE SAME if it fails</title>
<updated>2013-10-05T14:13:11Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2013-09-19T16:13:58Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e9d60f699108682bad9c4604feb408ac9198b232'/>
<id>urn:sha1:e9d60f699108682bad9c4604feb408ac9198b232</id>
<content type='text'>
commit f84cb8a46a771f36a04a02c61ea635c968ed5f6a upstream.

Work around the SCSI layer's problematic WRITE SAME heuristics by
disabling WRITE SAME in the DM multipath device's queue_limits if an
underlying device disabled it.
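
The idea can be sketched with a small Python model. The Queue class and the
on_write_same_failed function are hypothetical names that only mirror the
max_write_same_sectors limit mentioned in this message:

```python
class Queue:
    """Toy stand-in for a block device queue and its WRITE SAME limit."""

    def __init__(self, max_write_same_sectors):
        self.max_write_same_sectors = max_write_same_sectors


def on_write_same_failed(mpath_queue, path_queue):
    # If the underlying path no longer advertises WRITE SAME,
    # stop advertising it on the multipath device as well.
    if path_queue.max_write_same_sectors == 0:
        mpath_queue.max_write_same_sectors = 0
```

A zero limit on the multipath queue stops further WRITE SAME requests from
being issued to it.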

The WRITE SAME heuristics, with both the original commit 5db44863b6eb
("[SCSI] sd: Implement support for WRITE SAME") and the updated commit
66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling
WRITE SAME(10) even without successfully determining it is supported.
After the first failed WRITE SAME the SCSI layer will disable WRITE SAME
for the device (by setting sdkp-&gt;device-&gt;no_write_same, which results in
'max_write_same_sectors' in the device's queue_limits being set to 0).

When a device is stacked on top of such a SCSI device any changes to that
SCSI device's queue_limits do not automatically propagate up the stack.
As such, a DM multipath device will not have its WRITE SAME support
disabled.  This causes the block layer to continue to issue WRITE SAME
requests to the mpath device which causes paths to fail and (if mpath IO
isn't configured to queue when no paths are available) it will result in
actual IO errors to the upper layers.

This fix doesn't help configurations that have additional devices
stacked on top of the mpath device (e.g. LVM-created linear DM devices
on top).  A proper fix that restacks all the queue_limits from the bottom
of the device stack up will need to be explored if SCSI will continue to
use this model of optimistically allowing op codes and then disabling
them after they fail for the first time.

Before this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.
device-mapper: multipath: Failing path 8:112.
end_request: I/O error, dev dm-6, sector 4616
dm-6: WRITE SAME failed. Manually zeroing.
end_request: I/O error, dev dm-6, sector 4616
end_request: I/O error, dev dm-6, sector 5640
end_request: I/O error, dev dm-6, sector 6664
end_request: I/O error, dev dm-6, sector 7688
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.
end_request: I/O error, dev dm-6, sector 524296
Aborting journal on device dm-6-8.
end_request: I/O error, dev dm-6, sector 524288
Buffer I/O error on device dm-6, logical block 65536
lost page write due to I/O error on dm-6
JBD2: Error -5 detected when updating journal superblock for dm-6-8.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
33553920

After this patch:

EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121)
device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121
end_request: critical target error, dev dm-6, sector 528
dm-6: WRITE SAME failed. Manually zeroing.

# cat /sys/block/sdh/queue/write_same_max_bytes
0
# cat /sys/block/dm-6/queue/write_same_max_bytes
0

It should be noted that WRITE SAME support wasn't enabled in DM
multipath until v3.10.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm-snapshot: fix performance degradation due to small hash size</title>
<updated>2013-10-05T14:13:11Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-09-18T23:40:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0f64fad39c9577f3eaa26b45a9ad774c415c19ff'/>
<id>urn:sha1:0f64fad39c9577f3eaa26b45a9ad774c415c19ff</id>
<content type='text'>
commit 60e356f381954d79088d0455e357db48cfdd6857 upstream.

LVM2, since version 2.02.96, creates the origin with zero size, then loads
the snapshot driver and then loads the origin.  Consequently, the
snapshot driver sees an origin size of zero and sets the hash size to the
lower bound of 64.  Such a small hash table causes performance degradation.

This patch changes it so that the hash size is determined by the size of
the snapshot volume, not the minimum of the origin and snapshot sizes.  It doesn't
make sense to set the snapshot size significantly larger than the origin
size, so we do not need to take origin size into account when
calculating the hash size.
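
A hedged Python sketch of the sizing rule. The lower bound of 64 comes from
this message; the max_buckets cap and the rounding down to a power of two are
assumptions made for illustration:

```python
MIN_HASH_SIZE = 64  # lower bound mentioned in this commit message


def hash_size_for(snapshot_sectors, chunk_sectors, max_buckets):
    """Pick a hash table size from the snapshot volume size alone."""
    size = snapshot_sectors // chunk_sectors   # one bucket per chunk
    size = min(size, max_buckets)              # assumed memory-based cap
    size = max(size, MIN_HASH_SIZE)            # never go below the bound
    # Assumed: round down to a power of two.
    return 2 ** (size.bit_length() - 1)
```

Because the origin size is not an input, a zero-sized origin at load time no
longer forces the table down to the minimum.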

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
