<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/md, branch v3.4.11</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/drivers/md?h=v3.4.11</id>
<link rel='self' href='https://git.amat.us/linux/atom/drivers/md?h=v3.4.11'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2012-08-15T15:10:08Z</updated>
<entry>
<title>md/raid1: don't abort a resync on the first badblock.</title>
<updated>2012-08-15T15:10:08Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-07-31T00:05:34Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fb060f3d95d29c385063151c8a09faa4d1a02f2f'/>
<id>urn:sha1:fb060f3d95d29c385063151c8a09faa4d1a02f2f</id>
<content type='text'>
commit b7219ccb33aa0df9949a60c68b5e9f712615e56f upstream.

If a resync of a RAID1 array with 2 devices finds a known bad block
one device it will neither read from, or write to, that device for
this block offset.
So there will be one read_target (The other device) and zero write
targets.
This condition causes md/raid1 to abort the resync assuming that it
has finished - without known bad blocks this would be true.

When there are no write targets because of the presence of bad blocks
we should only skip over the area covered by the bad block.
RAID10 already gets this right, raid1 doesn't.  Or didn't.

As this can cause a 'sync' to abort early and appear to have succeeded
it could lead to some data corruption, so it suitable for -stable.

Reported-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: fix memory leak in process_prepared_mapping error paths</title>
<updated>2012-08-09T15:31:40Z</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2012-07-27T14:08:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e73b09d8f258df200adb6383c77c38c31f6f4baa'/>
<id>urn:sha1:e73b09d8f258df200adb6383c77c38c31f6f4baa</id>
<content type='text'>
commit 905386f82d08f66726912f303f3e6605248c60a3 upstream.

Fix memory leak in process_prepared_mapping by always freeing
the dm_thin_new_mapping structs from the mapping_pool mempool on
the error paths.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: reduce endio_hook pool size</title>
<updated>2012-08-09T15:31:40Z</updated>
<author>
<name>Alasdair G Kergon</name>
<email>agk@redhat.com</email>
</author>
<published>2012-07-27T14:07:57Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b4ce163953a4baab2341ba42386904a7199d11af'/>
<id>urn:sha1:b4ce163953a4baab2341ba42386904a7199d11af</id>
<content type='text'>
commit 7768ed33ccdc02801c4483fc5682dc66ace14aea upstream.

Reduce the slab size used for the dm_thin_endio_hook mempool.

Allocation has been seen to fail on machines with smaller amounts
of memory due to fragmentation.

  lvm: page allocation failure. order:5, mode:0xd0
  device-mapper: table: 253:38: thin-pool: Error creating pool's endio_hook mempool

Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm raid1: set discard_zeroes_data_unsupported</title>
<updated>2012-07-29T15:04:21Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2012-07-20T13:25:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=5a4db9ee4f44658077a11c71d78a17573016fc0e'/>
<id>urn:sha1:5a4db9ee4f44658077a11c71d78a17573016fc0e</id>
<content type='text'>
commit 7c8d3a42fe1c58a7e8fd3f6a013e7d7b474ff931 upstream.

We can't guarantee that REQ_DISCARD on dm-mirror zeroes the data even if
the underlying disks support zero on discard.  So this patch sets
ti-&gt;discard_zeroes_data_unsupported.

For example, if the mirror is in the process of resynchronizing, it may
happen that kcopyd reads a piece of data, then discard is sent on the
same area and then kcopyd writes the piece of data to another leg.
Consequently, the data is not zeroed.

The flag was made available by commit 983c7db347db8ce2d8453fd1d89b7a4bb6920d56
(dm crypt: always disable discard_zeroes_data).

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm raid1: fix crash with mirror recovery and discard</title>
<updated>2012-07-29T15:04:21Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2012-07-20T13:25:03Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=91aafba4414743c24c6d06ccca75113e4abd13a8'/>
<id>urn:sha1:91aafba4414743c24c6d06ccca75113e4abd13a8</id>
<content type='text'>
commit 751f188dd5ab95b3f2b5f2f467c38aae5a2877eb upstream.

This patch fixes a crash when a discard request is sent during mirror
recovery.

Firstly, some background.  Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number
  simultaneously recovered regions (by default the semaphore value is 1,
  so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
  __rh_recovery_prepare asks the log driver for the next region to
  recover. Then, it sets the region state to DM_RH_RECOVERING. If there
  are no pending I/Os on this region, the region is added to
  quiesced_regions list. If there are pending I/Os, the region is not
  added to any list. It is added to the quiesced_regions list later (by
  dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
  flight on this region. The region is popped from the list in
  dm_rh_recovery_start function. Then, a kcopyd job is started in the
  recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
  dm_rh_recovery_end. dm_rh_recovery_end adds the region to
  recovered_regions or failed_recovered_regions list (depending on
  whether the copy operation was successful or not).

The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg-&gt;pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg-&gt;pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.

Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.

There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.

These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit
5fc2ffeabb9ee0fc0e71ff16b49f34f0ed3d05b4 (dm raid1: support discard).

In do_writes, REQ_DISCARD requests is always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING).  However,
dm_rh_inc and dm_rh_dec is called for REQ_DISCARD resusts.  So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.

This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.

Reference: https://bugzilla.redhat.com/837607

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm thin: do not send discards to shared blocks</title>
<updated>2012-07-29T15:04:20Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2012-07-20T13:25:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=5b8bbc39d5678179f2fd4ee2e09005d8f277834c'/>
<id>urn:sha1:5b8bbc39d5678179f2fd4ee2e09005d8f277834c</id>
<content type='text'>
commit 650d2a06b4fe1cc1d218c20e256650f68bf0ca31 upstream.

When process_discard receives a partial discard that doesn't cover a
full block, it sends this discard down to that block. Unfortunately, the
block can be shared and the discard would corrupt the other snapshots
sharing this block.

This patch detects block sharing and ends the discard with success when
sending it to the shared block.

The above change means that if the device supports discard it can't be
guaranteed that a discard request zeroes data. Therefore, we set
ti-&gt;discard_zeroes_data_unsupported.

Thin target discard support with this bug arrived in commit
104655fd4dcebd50068ef30253a001da72e3a081 (dm thin: support discards).

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Alasdair G Kergon &lt;agk@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid1: close some possible races on write errors during resync</title>
<updated>2012-07-29T15:04:17Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-07-19T05:59:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d8ae4bb4a1c12f9cfb373f215e624a9f1fca0767'/>
<id>urn:sha1:d8ae4bb4a1c12f9cfb373f215e624a9f1fca0767</id>
<content type='text'>
commit 58e94ae18478c08229626daece2fc108a4a23261 upstream.

commit 4367af556133723d0f443e14ca8170d9447317cb
   md/raid1: clear bad-block record when write succeeds.

Added a 'reschedule_retry' call possibility at the end of
end_sync_write, but didn't add matching code at the end of
sync_request_write.  So if the writes complete very quickly, or
scheduling makes it seem that way, then we can miss rescheduling
the request and the resync could hang.

Also commit 73d5c38a9536142e062c35997b044e89166e063b
    md: avoid races when stopping resync.

Fix a race condition in this same code in end_sync_write but didn't
make the change in sync_request_write.

This patch updates sync_request_write to fix both of those.
Patch is suitable for 3.1 and later kernels.

Reported-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Original-version-by: Alexander Lyakas &lt;alex.bolshoy@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md: avoid crash when stopping md array races with closing other open fds.</title>
<updated>2012-07-29T15:04:17Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-07-19T05:59:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=2ac8a0f58a8782ff8024404a579eee260e8b2010'/>
<id>urn:sha1:2ac8a0f58a8782ff8024404a579eee260e8b2010</id>
<content type='text'>
commit a05b7ea03d72f36edb0cec05e8893803335c61a0 upstream.

md will refuse to stop an array if any other fd (or mounted fs) is
using it.
When any fs is unmounted of when the last open fd is closed all
pending IO will be flushed (e.g. sync_blockdev call in __blkdev_put)
so there will be no pending IO to worry about when the array is
stopped.

However in order to send the STOP_ARRAY ioctl to stop the array one
must first get and open fd on the block device.
If some fd is being used to write to the block device and it is closed
after mdadm open the block device, but before mdadm issues the
STOP_ARRAY ioctl, then there will be no last-close on the md device so
__blkdev_put will not call sync_blockdev.

If this happens, then IO can still be in-flight while md tears down
the array and bad things can happen (use-after-free and subsequent
havoc).

So in the case where do_md_stop is being called from an open file
descriptor, call sync_block after taking the mutex to ensure there
will be no new openers.

This is needed when setting a read-write device to read-only too.

Reported-by: majianpeng &lt;majianpeng@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid1: fix use-after-free bug in RAID1 data-check code.</title>
<updated>2012-07-19T15:58:55Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2012-07-09T01:34:13Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=e301f7b1dbbb0af6dcb8ed10c800fa95bb34917a'/>
<id>urn:sha1:e301f7b1dbbb0af6dcb8ed10c800fa95bb34917a</id>
<content type='text'>
commit 2d4f4f3384d4ef4f7c571448e803a1ce721113d5 upstream.

This bug has been present ever since data-check was introduce
in 2.6.16.  However it would only fire if a data-check were
done on a degraded array, which was only possible if the array
has 3 or more devices.  This is certainly possible, but is quite
uncommon.

Since hot-replace was added in 3.3 it can happen more often as
the same condition can arise if not all possible replacements are
present.

The problem is that as soon as we submit the last read request, the
'r1_bio' structure could be freed at any time, so we really should
stop looking at it.  If the last device is being read from we will
stop looking at it.  However if the last device is not due to be read
from, we will still check the bio pointer in the r1_bio, but the
r1_bio might already be free.

So use the read_targets counter to make sure we stop looking for bios
to submit as soon as we have submitted them all.

This fix is suitable for any -stable kernel since 2.6.16.

Reported-by: Arnold Schulz &lt;arnysch@gmx.net&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid5: Do not add data_offset before call to is_badblock</title>
<updated>2012-07-16T16:04:43Z</updated>
<author>
<name>majianpeng</name>
<email>majianpeng@gmail.com</email>
</author>
<published>2012-06-12T00:31:10Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d3f940223c31cffd977864cdc13c69e89f03ce55'/>
<id>urn:sha1:d3f940223c31cffd977864cdc13c69e89f03ce55</id>
<content type='text'>
commit 6c0544e255dd6582a9899572e120fb55d9f672a4 upstream.

In chunk_aligned_read() we are adding data_offset before calling
is_badblock.  But is_badblock also adds data_offset, so that is bad.

So move the addition of data_offset to after the call to
is_badblock.

This bug was introduced by commit 31c176ecdf3563140e639
     md/raid5: avoid reading from known bad blocks.
which first appeared in 3.0.  So that patch is suitable for any
-stable kernel from 3.0.y onwards.  However it will need minor
revision for most of those (as the comment didn't appear until
recently).

Signed-off-by: majianpeng &lt;majianpeng@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
[bwh: Backported to 3.2: ignored missing comment]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
