<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/md, branch v3.4.73</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/drivers/md?h=v3.4.73</id>
<link rel='self' href='https://git.amat.us/linux/atom/drivers/md?h=v3.4.73'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2013-12-08T15:29:43Z</updated>
<entry>
<title>dm: fix truncated status strings</title>
<updated>2013-12-08T15:29:43Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-03-01T22:45:44Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d8b8a43e0f3c99bb29f258ef508969793f8e43bd'/>
<id>urn:sha1:d8b8a43e0f3c99bb29f258ef508969793f8e43bd</id>
<content type='text'>
commit fd7c092e711ebab55b2688d3859d95dfd0301f73 upstream.

Avoid returning a truncated table or status string instead of setting
the DM_BUFFER_FULL_FLAG when the last target of a table fills the
buffer.

When processing a table or status request, the function retrieve_status
calls ti-&gt;type-&gt;status. If ti-&gt;type-&gt;status returns non-zero,
retrieve_status assumes that the buffer overflowed and sets
DM_BUFFER_FULL_FLAG.

However, targets don't return non-zero values from their status method
on overflow. Most targets returns always zero.

If a buffer overflow happens in a target that is not the last in the
table, it gets noticed during the next iteration of the loop in
retrieve_status; but if a buffer overflow happens in the last target, it
goes unnoticed and erroneously truncated data is returned.

In the current code, the targets behave in the following way:
* dm-crypt returns -ENOMEM if there is not enough space to store the
  key, but it returns 0 on all other overflows.
* dm-thin returns errors from the status method if a disk error happened.
  This is incorrect because retrieve_status doesn't check the error
  code, it assumes that all non-zero values mean buffer overflow.
* all the other targets always return 0.

This patch changes the ti-&gt;type-&gt;status function to return void (because
most targets don't use the return code). Overflow is detected in
retrieve_status: if the status method fills up the remaining space
completely, it is assumed that buffer overflow happened.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: fix calculation of stacking limits on level change.</title>
<updated>2013-12-04T18:50:33Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-11-14T04:16:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=f69a26d2cff8533570ca3ef3c59f2e1174b084b1'/>
<id>urn:sha1:f69a26d2cff8533570ca3ef3c59f2e1174b084b1</id>
<content type='text'>
commit 02e5f5c0a0f726e66e3d8506ea1691e344277969 upstream.

The various -&gt;run routines of md personalities assume that the 'queue'
has been initialised by the blk_set_stacking_limits() call in
md_alloc().

However when the level is changed (by level_store()) the -&gt;run routine
for the new level is called for an array which has already had the
stacking limits modified.  This can result in incorrect final
settings.

So call blk_set_stacking_limits() before -&gt;run in level_store().

A specific consequence of this bug is that it causes
discard_granularity to be set incorrectly when reshaping a RAID4 to a
RAID0.

This is suitable for any -stable kernel since 3.3 in which
blk_set_stacking_limits() was introduced.

Reported-and-tested-by: "Baldysiak, Pawel" &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm: allocate buffer for messages with small number of arguments using GFP_NOIO</title>
<updated>2013-12-04T18:50:31Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-10-31T17:55:45Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3ccb527f871a225b8eada036d24d41c59e40a2e4'/>
<id>urn:sha1:3ccb527f871a225b8eada036d24d41c59e40a2e4</id>
<content type='text'>
commit f36afb3957353d2529cb2b00f78fdccd14fc5e9c upstream.

dm-mpath and dm-thin must process messages even if some device is
suspended, so we allocate argv buffer with GFP_NOIO. These messages have
a small fixed number of arguments.

On the other hand, dm-switch needs to process bulk data using messages
so excessive use of GFP_NOIO could cause trouble.

The patch also lowers the default number of arguments from 64 to 8, so
that there is smaller load on GFP_NOIO allocations.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Acked-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: Fix skipping recovery for read-only arrays.</title>
<updated>2013-11-13T03:01:48Z</updated>
<author>
<name>Lukasz Dorau</name>
<email>lukasz.dorau@intel.com</email>
</author>
<published>2013-10-24T01:55:17Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=268417aeea96a1941c2cd14f0f8b5b39c374ad25'/>
<id>urn:sha1:268417aeea96a1941c2cd14f0f8b5b39c374ad25</id>
<content type='text'>
commit 61e4947c99c4494336254ec540c50186d186150b upstream.

Since:
        commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86
        md: Allow devices to be re-added to a read-only array.

spares are activated on a read-only array. In case of raid1 and raid10
personalities it causes that not-in-sync devices are marked in-sync
without checking if recovery has been finished.

If a read-only array is degraded and one of its devices is not in-sync
(because the array has been only partially recovered) recovery will be skipped.

This patch adds checking if recovery has been finished before marking a device
in-sync for raid1 and raid10 personalities. In case of raid5 personality
such condition is already present (at raid5.c:6029).

Bug was introduced in 3.10 and causes data corruption.

Signed-off-by: Pawel Baldysiak &lt;pawel.baldysiak@intel.com&gt;
Signed-off-by: Lukasz Dorau &lt;lukasz.dorau@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm snapshot: fix data corruption</title>
<updated>2013-11-04T12:23:42Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-10-16T02:17:47Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=df6516ade182c732e3d2691e0b60190f7abc1261'/>
<id>urn:sha1:df6516ade182c732e3d2691e0b60190f7abc1261</id>
<content type='text'>
commit e9c6a182649f4259db704ae15a91ac820e63b0ca upstream.

This patch fixes a particular type of data corruption that has been
encountered when loading a snapshot's metadata from disk.

When we allocate a new chunk in persistent_prepare, we increment
ps-&gt;next_free and we make sure that it doesn't point to a metadata area
by further incrementing it if necessary.

When we load metadata from disk on device activation, ps-&gt;next_free is
positioned after the last used data chunk. However, if this last used
data chunk is followed by a metadata area, ps-&gt;next_free is positioned
erroneously to the metadata area. A newly-allocated chunk is placed at
the same location as the metadata area, resulting in data or metadata
corruption.

This patch changes the code so that ps-&gt;next_free skips the metadata
area when metadata are loaded in function read_exceptions.

The patch also moves a piece of code from persistent_prepare_exception
to a separate function skip_metadata to avoid code duplication.

CVE-2013-4299

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm-snapshot: fix performance degradation due to small hash size</title>
<updated>2013-10-05T14:06:54Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-09-18T23:40:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=182410bd95348ef1ee5b1bccff8c2c75cec72b44'/>
<id>urn:sha1:182410bd95348ef1ee5b1bccff8c2c75cec72b44</id>
<content type='text'>
commit 60e356f381954d79088d0455e357db48cfdd6857 upstream.

LVM2, since version 2.02.96, creates origin with zero size, then loads
the snapshot driver and then loads the origin.  Consequently, the
snapshot driver sees the origin size zero and sets the hash size to the
lower bound 64.  Such small hash table causes performance degradation.

This patch changes it so that the hash size is determined by the size of
snapshot volume, not minimum of origin and snapshot size.  It doesn't
make sense to set the snapshot size significantly larger than the origin
size, so we do not need to take origin size into account when
calculating the hash size.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm snapshot: workaround for a false positive lockdep warning</title>
<updated>2013-10-05T14:06:54Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2013-09-18T23:14:22Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=594eaa187a86da35487f59b9f2913e0f28fc5826'/>
<id>urn:sha1:594eaa187a86da35487f59b9f2913e0f28fc5826</id>
<content type='text'>
commit 5ea330a75bd86b2b2a01d7b85c516983238306fb upstream.

The kernel reports a lockdep warning if a snapshot is invalidated because
it runs out of space.

The lockdep warning was triggered by commit 0976dfc1d0cd80a4e9dfaf87bd87
("workqueue: Catch more locking problems with flush_work()") in v3.5.

The warning is false positive.  The real cause for the warning is that
the lockdep engine treats different instances of md-&gt;lock as a single
lock.

This patch is a workaround - we use flush_workqueue instead of flush_work.
This code path is not performance sensitive (it is called only on
initialization or invalidation), thus it doesn't matter that we flush the
whole workqueue.

The real fix for the problem would be to teach the lockdep engine to treat
different instances of md-&gt;lock as separate locks.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Acked-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid1,raid10: use freeze_array in place of raise_barrier in various places.</title>
<updated>2013-08-20T15:26:28Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-06-12T01:01:22Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=1b9203bb4c658c0242afa6fdb025c71d2fc3ad76'/>
<id>urn:sha1:1b9203bb4c658c0242afa6fdb025c71d2fc3ad76</id>
<content type='text'>
commit e2d59925221cd562e07fee38ec8839f7209ae603 upstream.

Various places in raid1 and raid10 are calling raise_barrier when they
really should call freeze_array.
The former is only intended to be called from "make_request".
The later has extra checks for 'nr_queued' and makes a call to
flush_pending_writes(), so it is safe to call it from within the
management thread.

Using raise_barrier will sometimes deadlock.  Using freeze_array
should not.

As 'freeze_array' currently expects one request to be pending (in
handle_read_error - the only previous caller), we need to pass
it the number of pending requests (extra) to ignore.

The deadlock was made particularly noticeable by commits
050b66152f87c7 (raid10) and 6b740b8d79252f13 (raid1) which
appeared in 3.4, so the fix is appropriate for any -stable
kernel since then.

This patch probably won't apply directly to some early kernels and
will need to be applied by hand.

Cc: stable@vger.kernel.org
Reported-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
[adjust context to make it can be apply on top of 3.4 ]
Signed-off-by: Jack Wang &lt;jinpu.wang@profitbricks.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid10: remove use-after-free bug.</title>
<updated>2013-08-04T08:26:00Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-07-24T05:37:42Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=aa2f8abe27dd7058fd7bfe52401121037a285406'/>
<id>urn:sha1:aa2f8abe27dd7058fd7bfe52401121037a285406</id>
<content type='text'>
commit 0eb25bb027a100f5a9df8991f2f628e7d851bc1e upstream.

We always need to be careful when calling generic_make_request, as it
can start a chain of events which might free something that we are
using.

Here is one place I wasn't careful enough.  If the wbio2 is not in
use, then it might get freed at the first generic_make_request call.
So perform all necessary tests first.

This bug was introduced in 3.3-rc3 (24afd80d99) and can cause an
oops, so fix is suitable for any -stable since then.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid5: fix interaction of 'replace' and 'recovery'.</title>
<updated>2013-08-04T08:26:00Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2013-07-22T02:57:21Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0761d079bbc23ffd309c586ddd142b3cb5f11e0d'/>
<id>urn:sha1:0761d079bbc23ffd309c586ddd142b3cb5f11e0d</id>
<content type='text'>
commit f94c0b6658c7edea8bc19d13be321e3860a3fa54 upstream.

If a device in a RAID4/5/6 is being replaced while another is being
recovered, then the writes to the replacement device currently don't
happen, resulting in corruption when the replacement completes and the
new drive takes over.

This is because the replacement writes are only triggered when
's.replacing' is set and not when the similar 's.sync' is set (which
is the case during resync and recovery - it means all devices need to
be read).

So schedule those writes when s.replacing is set as well.

In this case we cannot use "STRIPE_INSYNC" to record that the
replacement has happened as that is needed for recording that any
parity calculation is complete.  So introduce STRIPE_REPLACED to
record if the replacement has happened.

For safety we should also check that STRIPE_COMPUTE_RUN is not set.
This has a similar effect to the "s.locked == 0" test.  The latter
ensure that now IO has been flagged but not started.  The former
checks if any parity calculation has been flagged by not started.
We must wait for both of these to complete before triggering the
'replace'.

Add a similar test to the subsequent check for "are we finished yet".
This possibly isn't needed (is subsumed in the STRIPE_INSYNC test),
but it makes it more obvious that the REPLACE will happen before we
think we are finished.

Finally if a NeedReplace device is not UPTODATE then that is an
error.  We really must trigger a warning.

This bug was introduced in commit 9a3e1101b827a59ac9036a672f5fa8d5279d0fe2
(md/raid5:  detect and handle replacements during recovery.)
which introduced replacement for raid5.
That was in 3.3-rc3, so any stable kernel since then would benefit
from this fix.

Reported-by: qindehua &lt;13691222965@163.com&gt;
Tested-by: qindehua &lt;qindehua@163.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
