<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs, branch v2.6.24.5</title>
<subtitle>Linux kernel source tree</subtitle>
<id>https://git.amat.us/linux/atom/fs?h=v2.6.24.5</id>
<link rel='self' href='https://git.amat.us/linux/atom/fs?h=v2.6.24.5'/>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/'/>
<updated>2008-04-19T01:53:30Z</updated>
<entry>
<title>locks: fix possible infinite loop in fcntl(F_SETLKW) over nfs</title>
<updated>2008-04-19T01:53:30Z</updated>
<author>
<name>J. Bruce Fields</name>
<email>bfields@citi.umich.edu</email>
</author>
<published>2008-04-14T19:03:02Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=fa4bf970097e80c3ba50467a8b99c8f97a6391f0'/>
<id>urn:sha1:fa4bf970097e80c3ba50467a8b99c8f97a6391f0</id>
<content type='text'>
upstream commit: 19e729a928172103e101ffd0829fd13e68c13f78

Miklos Szeredi found the bug:

	"Basically what happens is that on the server nlm_fopen() calls
	nfsd_open() which returns -EACCES, to which nlm_fopen() returns
	NLM_LCK_DENIED.

	"On the client this will turn into a -EAGAIN (nlm_stat_to_errno()),
	which in turn will cause fcntl_setlk() to retry forever."

So, for example, opening a file on an nfs filesystem, changing
permissions to forbid further access, then trying to lock the file,
could result in an infinite loop.

And Trond Myklebust identified the culprit, a commit from Marc Eshel and me:

	7723ec9777d9832849b76475b1a21a2872a40d20 "locks: factor out
	generic/filesystem switch from setlock code"

That commit claimed to just be reshuffling code, but actually introduced
a behavioral change by calling the lock method repeatedly as long as it
returned -EAGAIN.

We assumed this would be safe, since we assumed a lock of type SETLKW
would only return with either success or an error other than -EAGAIN.
However, nfs can in fact return -EAGAIN in this situation, and
independently of whether that behavior is correct or not, we don't
actually need this change, and it seems far safer not to depend on such
assumptions about the filesystem's -&gt;lock method.

Therefore, revert the problematic part of the original commit.  This
leaves vfs_lock_file() and its other callers unchanged, while returning
fcntl_setlk and fcntl_setlk64 to their former behavior.
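
A minimal Python sketch of the behavioral change described above.  The
names are hypothetical and are not kernel interfaces: the stub stands in
for a filesystem lock method that keeps returning -EAGAIN, and max_calls
is an assumption added only so that the sketch terminates.

```python
import errno

def always_eagain_lock():
    """Hypothetical stand-in for a lock method that never succeeds."""
    return -errno.EAGAIN

def setlk_retrying(lock_method, max_calls=1000):
    """Model of the reverted behavior: call again while -EAGAIN comes back.

    max_calls exists only so this sketch terminates (an assumption).
    """
    calls = 0
    while calls != max_calls:
        calls += 1
        ret = lock_method()
        if ret != -errno.EAGAIN:
            return ret, calls
    return -errno.EAGAIN, calls

def setlk_once(lock_method):
    """Model of the restored behavior: one call, result handed back."""
    return lock_method(), 1

if __name__ == "__main__":
    print(setlk_retrying(always_eagain_lock))
    print(setlk_once(always_eagain_lock))
```

With the stub, the retrying variant only stops because of the cap, while
the single-call variant hands -EAGAIN straight back to its caller.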

Signed-off-by: J. Bruce Fields &lt;bfields@citi.umich.edu&gt;
Tested-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Marc Eshel &lt;eshel@almaden.ibm.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>signalfd: fix for incorrect SI_QUEUE user data reporting</title>
<updated>2008-04-19T01:53:29Z</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2008-04-11T16:55:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=c9c5091171cbf780bb293e6406dd8632b678bae8'/>
<id>urn:sha1:c9c5091171cbf780bb293e6406dd8632b678bae8</id>
<content type='text'>
upstream commit: 0859ab59a8a48d2a96b9d2b7100889bcb6bb5818

Michael Kerrisk found out that signalfd was not reporting back user data
pushed using sigqueue:

  http://groups.google.com/group/linux.kernel/msg/9397cab8551e3123

The following patch makes signalfd report back the ssi_ptr and ssi_int members
of the signalfd_siginfo structure.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Acked-by: Michael Kerrisk &lt;mtk.manpages@googlemail.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>HFS+: fix unlink of links</title>
<updated>2008-04-19T01:53:28Z</updated>
<author>
<name>Roman Zippel</name>
<email>zippel@linux-m68k.org</email>
</author>
<published>2008-04-09T15:44:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=b1c9cdea40bcedd6ab88de759162ef01d7b50789'/>
<id>urn:sha1:b1c9cdea40bcedd6ab88de759162ef01d7b50789</id>
<content type='text'>
upstream commit: 76b0c26af2736b7e5b87e6ed7ab63901483d5736

Some time ago, while attempting to handle invalid link counts, I botched
the unlink of links itself.  This patch fixes it properly, so that the
link count is ignored only for nodes that don't point to links.
Thanks to Vlado Plaga &lt;rechner@vlado-do.de&gt; for notifying me of this
problem.

Signed-off-by: Roman Zippel &lt;zippel@linux-m68k.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>vfs: fix data leak in nobh_write_end()</title>
<updated>2008-04-19T01:53:21Z</updated>
<author>
<name>Dmitri Monakhov</name>
<email>dmonakhov@openvz.org</email>
</author>
<published>2008-03-28T22:10:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6ed609bc70e1cce983650707b0b7c12265ab96f6'/>
<id>urn:sha1:6ed609bc70e1cce983650707b0b7c12265ab96f6</id>
<content type='text'>
upstream commit: 5b41e74ad1b0bf7bc51765ae74e5dc564afc3e48

The current nobh_write_end() implementation ignores the partial write
(copied &lt; len) case if the page was fully mapped, and simply marks the page
as Uptodate.  This is totally wrong, because the area [pos+copied, pos+len)
wasn't updated explicitly in the previous write_begin call.  It simply
contains garbage from the pagecache, which results in data leakage.

#TEST_CASE_BEGIN:
~~~~~~~~~~~~~~~~
In fact the issue is triggered by a classical testcase:
	open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
	ftruncate(3, 409600)                    = 0
	writev(3, [{"a", 1}, {NULL, 4095}], 2)  = 1
##TESTCASE_SOURCE:
~~~~~~~~~~~~~~~~~
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/uio.h&gt;
#include &lt;sys/mman.h&gt;
#include &lt;errno.h&gt;
int main(int argc, char **argv)
{
	int fd,  ret;
	void* p;
	struct iovec iov[2];
	fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
	ftruncate(fd, 409600);
	iov[0].iov_base="a";
	iov[0].iov_len=1;
	iov[1].iov_base=NULL;
	iov[1].iov_len=4096;
	ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec));
	printf("writev  = %d, err = %d\n", ret, errno);
	return 0;
}
##TESTCASE RESULT:
~~~~~~~~~~~~~~~~~~
[root@ts63 ~]# mount | grep mnt2
/dev/mapper/test on /mnt2 type ext2 (rw,nobh)
[root@ts63 ~]#  /tmp/writev /mnt2/test
writev  = 1, err = 0
[root@ts63 ~]# hexdump -C /mnt2/test

00000000  61 65 62 6f 6f 74 00 00  f0 b9 b4 59 3a 00 00 00  |aeboot.....Y:...|
00000010  20 00 00 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000020  df df df df df df df df  df df df df df df df df  |................|
00000030  3a 00 00 00 2a 00 00 00  21 00 00 00 00 00 00 00  |:...*...!.......|
00000040  60 c0 8c 00 00 00 00 00  40 4a 8d 00 00 00 00 00  |`.......@J......|
00000050  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000060  74 69 6d 65 20 64 64 20  69 66 3d 2f 64 65 76 2f  |time dd if=/dev/|
00000070  6c 6f 6f 70 30 20 20 6f  66 3d 2f 64 65 76 2f 6e  |loop0  of=/dev/n|
skip..
00000f50  00 00 00 00 00 00 00 00  31 00 00 00 00 00 00 00  |........1.......|
00000f60  6d 6b 66 73 2e 65 78 74  33 20 2f 64 65 76 2f 76  |mkfs.ext3 /dev/v|
00000f70  7a 76 67 2f 74 65 73 74  20 2d 62 34 30 39 36 00  |zvg/test -b4096.|
00000f80  a0 fe 8c 00 00 00 00 00  21 00 00 00 00 00 00 00  |........!.......|
00000f90  23 31 32 30 35 39 35 30  34 30 34 00 3a 00 00 00  |#1205950404.:...|
00000fa0  20 00 8d 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000fb0  d0 cf 8c 00 00 00 00 00  10 d0 8c 00 00 00 00 00  |................|
00000fc0  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000fd0  6d 6f 75 6e 74 20 2f 64  65 76 2f 76 7a 76 67 2f  |mount /dev/vzvg/|
00000fe0  74 65 73 74 20 20 2f 76  7a 20 2d 6f 20 64 61 74  |test  /vz -o dat|
00000ff0  61 3d 77 72 69 74 65 62  61 63 6b 00 00 00 00 00  |a=writeback.....|
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

As you can see, the file's page contains garbage from the pagecache instead
of zeros.
#TEST_CASE_END

Attached patch:
- Add a sanity check BUG_ON in order to prevent incorrect usage by the caller.
  This is a function invariant, because a page cannot have buffers and a
  non-zero *fsdata pointer at the same time.
- Always attach buffers to the page if it is the partial write case.
- Always switch back to generic_write_end if the page has buffers.
  This is reasonable because if the page already has buffers then
  generic_write_begin was called previously.
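
The leak itself can be illustrated with a small Python model.  Everything
in it is hypothetical: a bytearray stands in for a pagecache page that
still holds stale bytes, and the two functions contrast publishing the
whole range after a short copy with clearing the part that was never
copied.  It is a sketch of the failure mode, not of the kernel code or of
the attached patch.

```python
PAGE_SIZE = 4096

def stale_page():
    """Hypothetical pagecache page that still holds old, unrelated bytes."""
    return bytearray(b"S" * PAGE_SIZE)

def short_write_leaky(page, data, length):
    """Copy what was supplied, then treat all `length` bytes as valid."""
    copied = min(len(data), length)
    page[:copied] = data[:copied]
    return bytes(page[:length])        # bytes past `copied` are stale

def short_write_zeroed(page, data, length):
    """Copy what was supplied and clear the part that was never copied."""
    copied = min(len(data), length)
    page[:copied] = data[:copied]
    page[copied:length] = bytes(length - copied)
    return bytes(page[:length])

if __name__ == "__main__":
    leaky = short_write_leaky(stale_page(), b"a", PAGE_SIZE)
    clean = short_write_zeroed(stale_page(), b"a", PAGE_SIZE)
    print(leaky[:4], clean[:4])
```

The first variant exposes the stale bytes after the single byte that was
really written, which corresponds to the hexdump above.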

Signed-off-by: Dmitri Monakhov &lt;dmonakhov@openvz.org&gt;
Reviewed-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>inotify: remove debug code</title>
<updated>2008-04-19T01:53:20Z</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2008-03-25T12:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=3d6fec02c6a996f658bdaa6a1da381f9b72da032'/>
<id>urn:sha1:3d6fec02c6a996f658bdaa6a1da381f9b72da032</id>
<content type='text'>
upstream commit: 0d71bd5993b630a989d15adc2562a9ffe41cd26d

The inotify debugging code is supposed to verify that the
DCACHE_INOTIFY_PARENT_WATCHED scalability optimisation does not result in
notifications getting lost nor extra needless locking generated.

Unfortunately there are also some races in the debugging code.  And it isn't
very good at finding problems anyway.  So remove it for now.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: Robert Love &lt;rlove@google.com&gt;
Cc: John McCutchan &lt;ttb@tentacle.dhs.org&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Yan Zheng &lt;yanzheng@21cn.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Christian Lamparter &lt;chunkeey@web.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>inotify: fix race</title>
<updated>2008-04-19T01:53:20Z</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2008-03-25T12:48:15Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=4193242f7ca7c2626b440fe4e9dda57f2bcf0baa'/>
<id>urn:sha1:4193242f7ca7c2626b440fe4e9dda57f2bcf0baa</id>
<content type='text'>
upstream commit: d599e36a9ea85432587f4550acc113cd7549d12a

There is a race between setting an inode's children's "parent watched" flag
when placing the first watch on a parent, and instantiating new children of
that parent: a child could miss having its flags set by
set_dentry_child_flags, but then inotify_d_instantiate might still see
!inotify_inode_watched.

The solution is to set_dentry_child_flags after adding the watch.  Locking is
taken care of, because both set_dentry_child_flags and inotify_d_instantiate
hold dcache_lock and child-&gt;d_locks.
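
A minimal Python sketch of why the ordering matters.  The function names
mirror the ones mentioned above, but the model is hypothetical: a dict
stands in for the parent's children, each step runs atomically (standing
in for the locking), and child_arrives_at selects where the new child is
instantiated relative to the two watcher steps.

```python
def run(add_watch_first, child_arrives_at):
    """Return the new child's "parent watched" flag after all steps ran.

    child_arrives_at is 0, 1 or 2: before, between or after the two
    watcher steps.
    """
    state = {"watched": False}
    children = {}                      # child name to flag

    def add_watch():
        state["watched"] = True

    def set_dentry_child_flags():
        for name in children:
            children[name] = True

    def d_instantiate():
        children["new"] = state["watched"]

    if add_watch_first:
        steps = [add_watch, set_dentry_child_flags]
    else:
        steps = [set_dentry_child_flags, add_watch]
    steps.insert(child_arrives_at, d_instantiate)
    for step in steps:
        step()
    return children["new"]

if __name__ == "__main__":
    for order in (False, True):
        print(order, [run(order, pos) for pos in (0, 1, 2)])
```

In this model the child misses its flag only when the flags are set
before the watch is added and the child arrives in between.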

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Cc: Robert Love &lt;rlove@google.com&gt;
Cc: John McCutchan &lt;ttb@tentacle.dhs.org&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Yan Zheng &lt;yanzheng@21cn.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Christian Lamparter &lt;chunkeey@web.de&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
</entry>
<entry>
<title>aio: bad AIO race in aio_complete() leads to process hang</title>
<updated>2008-03-24T18:48:19Z</updated>
<author>
<name>Quentin Barnes</name>
<email>qbarnes+linux@yahoo-inc.com</email>
</author>
<published>2008-03-20T02:45:07Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=0db49fc729eee503836ea12745b55f7f802d2abb'/>
<id>urn:sha1:0db49fc729eee503836ea12745b55f7f802d2abb</id>
<content type='text'>
commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5

My group ran into an AIO process hang on a 2.6.24 kernel, with the process
sleeping indefinitely in io_getevents(2) waiting for a last wakeup that
never came.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but the L2 is shared between both
processors on the Core2Duo we use.

My analysis of the hang starts at the second while-loop in read_events().
This is what happens on processor #1:
	1) add_wait_queue_exclusive() adds thread to ctx-&gt;wait
	2) aio_read_evt() to check tail
	3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
	A) info-&gt;tail = tail;
	B) waitqueue_active(&amp;ctx-&gt;wait)
	C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx-&gt;wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx-&gt;wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes &lt;qbarnes+linux@yahoo-inc.com&gt;
Reviewed-by: Zach Brown &lt;zach.brown@oracle.com&gt;
Cc: Benjamin LaHaise &lt;bcrl@kvack.org&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>jbd: correctly unescape journal data blocks</title>
<updated>2008-03-24T18:48:06Z</updated>
<author>
<name>Duane Griffin</name>
<email>duaneg@dghda.com</email>
</author>
<published>2008-03-20T02:45:06Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=d447e76ecaa4d4bb005eef4de5655a8418a4d60d'/>
<id>urn:sha1:d447e76ecaa4d4bb005eef4de5655a8418a4d60d</id>
<content type='text'>
commit: 439aeec639d7c57f3561054a6d315c40fd24bb74

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.
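
A minimal Python sketch of the escaping round trip, assuming the scheme
that the description implies: a data block that begins with the journal
magic number has its first four bytes zeroed in the journal copy and is
flagged, and recovery puts the magic number back.  The function names and
the two-buffer model are hypothetical and are not the jbd code.

```python
# JFS_MAGIC_NUMBER as four bytes; big-endian order is assumed here.
MAGIC = (0xC03B3998).to_bytes(4, "big")

def escape(block):
    """Return (journal_copy, escaped_flag) for a data block."""
    if block[:4] == MAGIC:
        return bytes(4) + block[4:], True
    return block, False

def replay(journal_copy, escaped, unescape_right_buffer=True):
    """Rebuild the block during recovery.

    With unescape_right_buffer=False the magic number is restored into a
    different buffer, which models the typo described above.
    """
    source = bytearray(journal_copy)   # stands in for the journal buffer
    target = bytearray(journal_copy)   # stands in for the buffer written back
    if escaped:
        fixed = target if unescape_right_buffer else source
        fixed[:4] = MAGIC
    return bytes(target)

if __name__ == "__main__":
    copy, flag = escape(MAGIC)
    print(replay(copy, flag).hex())
    print(replay(copy, flag, unescape_right_buffer=False).hex())
```

In the faulty path the block written back keeps the zeroed bytes, which
matches the zeros seen after recovery.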

Signed-off-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>jbd2: correctly unescape journal data blocks</title>
<updated>2008-03-24T18:48:03Z</updated>
<author>
<name>Duane Griffin</name>
<email>duaneg@dghda.com</email>
</author>
<published>2008-03-20T02:45:05Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=9ecfdfeaf6210f17b93f4130845a9349ab893196'/>
<id>urn:sha1:9ecfdfeaf6210f17b93f4130845a9349ab893196</id>
<content type='text'>
commit: d00256766a0b4f1441931a7f569a13edf6c68200

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping.  At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device.  Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
<entry>
<title>zisofs: fix readpage() outside i_size</title>
<updated>2008-03-24T18:48:02Z</updated>
<author>
<name>Dave Young</name>
<email>hidave.darkstar@gmail.com</email>
</author>
<published>2008-03-20T02:45:04Z</published>
<link rel='alternate' type='text/html' href='https://git.amat.us/linux/commit/?id=6eb36c282b77ba9f392e7bc332f7fda80c310db6'/>
<id>urn:sha1:6eb36c282b77ba9f392e7bc332f7fda80c310db6</id>
<content type='text'>
commit: 08ca0db8aa2db4ddcf487d46d85dc8ffb22162cc

A read request outside i_size will be handled in do_generic_file_read().  So
we just return 0 rather than -EIO, and let do_generic_file_read() do the
rest.

At the same time we need to unlock the page to avoid hanging the system.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10227

Signed-off-by: Dave Young &lt;hidave.darkstar@gmail.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Report-by: Christian Perle &lt;chris@linuxinfotag.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
</entry>
</feed>
