aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2010-08-13Btrfs: Fix race in btrfs_mark_extent_writtenYan, Zheng
commit 6c7d54ac87f338c479d9729e8392eca3f76e11e1 upstream. Fix bug reported by Johannes Hirte. The reason of that bug is btrfs_del_items is called after btrfs_duplicate_item and btrfs_del_items triggers tree balance. The fix is check that case and call btrfs_search_slot when needed. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs, fix memory leaks in error pathsJiri Slaby
commit 2423fdfb96e3f9ff3baeb6c4c78d74145547891d upstream. Stanse found 2 memory leaks in relocate_block_group and __btrfs_map_block. cluster and multi are not freed/assigned on all paths. Fix that. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: linux-btrfs@vger.kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: align offsets for btrfs_ordered_update_i_sizeYan, Zheng
commit a038fab0cb873c75d6675e2bcffce8a3935bdce7 upstream. Some callers of btrfs_ordered_update_i_size can now pass in a NULL for the ordered extent to update against. This makes sure we properly align the offset they pass in when deciding how much to bump the on disk i_size. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13btrfs: fix missing last-entry in readdir(3)Jan Engelhardt
commit 406266ab9ac8ed8b085c58aacd9e3161480dc5d5 upstream. parent 49313cdac7b34c9f7ecbb1780cfc648b1c082cd7 (v2.6.32-1-g49313cd) commit ff48c08e1c05c67e8348ab6f8a24de8034e0e34d Author: Jan Engelhardt <jengelh@medozas.de> Date: Wed Dec 9 22:57:36 2009 +0100 Btrfs: fix missing last-entry in readdir(3) When one does a 32-bit readdir(3), the last entry of a directory is missing. This is however not due to passing a large value to filldir, but it seems to have to do with glibc doing telldir or something quirky. In any case, this patch fixes it in practice. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: make sure fallocate properly starts a transactionChris Mason
commit 3a1abec9f6880cf406593c392636199ea1c6c917 upstream. The recent patch to make fallocate enospc friendly would send down a NULL trans handle to the allocator. This moves the transaction start to properly fix things. Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: make metadata chunks smallerJosef Bacik
commit 83d3c9696fed237a3d96fce18299e2fcf112109f upstream. This patch makes us a bit less zealous about making sure we have enough free metadata space by pearing down the size of new metadata chunks to 256mb instead of 1gb. Also, we used to try an allocate metadata chunks when allocating data, but that sort of thing is done elsewhere now so we can just remove it. With my -ENOSPC test I used to have 3gb reserved for metadata out of 75gb, now I have 1.7gb. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Show discard option in /proc/mountsMatthew Wilcox
commit 20a5239a5d0f340e29827a6a2d28a138001c44b8 upstream. Christoph's patch e244a0aeb6a599c19a7c802cda6e2d67c847b154 doesn't display the discard option in /proc/mounts, leading to some confusion for me. Here's the missing bit. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: deny sys_link across subvolumes.TARUISI Hiroaki
commit 4a8be425a8fb8fbb5d881eb55fa6634c3463b9c9 upstream. I rebased Christian Parpart's patch to deny hard link across subvolumes. Original patch modifies also btrfs_rename, but I excluded it because we can move across subvolumes now and it make no problem. ----------------- Hard link across subvolumes should not allowed in Btrfs. btrfs_link checks root of 'to' directory is same as root of 'from' file. If not same, btrfs_link returns -EPERM. Signed-off-by: TARUISI Hiroaki <taruishi.hiroak@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: fail mount on bad mount optionsSage Weil
commit a7a3f7cadd9bdee569243f7ead9550aa16b60e07 upstream. We shouldn't silently ignore unrecognized options. Signed-off-by: Sage Weil <sage@newdream.net> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: don't add extent 0 to the free space cache v2Yan, Zheng
commit 06b2331f8333ec6edf41662757ce8882cc1747d5 upstream. If block group 0 is completely free, btrfs_read_block_groups will add extent [0, BTRFS_SUPER_INFO_OFFSET) to the free space cache. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Fix per root used space accountingYan, Zheng
commit 86b9f2eca5e0984145e3c7698a7cd6dd65c2a93f upstream. The bytes_used field in root item was originally planned to trace the amount of used data and tree blocks. But it never worked right since we can't trace freeing of data accurately. This patch changes it to only trace the amount of tree blocks. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Fix btrfs_drop_extent_cache for skip pinned caseYan, Zheng
commit 55ef68990029fcd8d04d42fc184aa7fb18cf309e upstream. The check for skip pinned case is wrong, it may breaks the while loop too soon. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Add delayed iputYan, Zheng
commit 24bbcf0442ee04660a5a030efdbb6d03f1c275cb upstream. iput() can trigger new transactions if we are dropping the final reference, so calling it in btrfs_commit_transaction may end up deadlock. This patch adds delayed iput to avoid the issue. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Pass transaction handle to security and ACL initialization functionsYan, Zheng
commit f34f57a3ab4e73304d78c125682f1a53cd3975f2 upstream. Pass transaction handle down to security and ACL initialization functions, so we can avoid starting nested transactions Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Make truncate(2) more ENOSPC friendlyYan, Zheng
commit 8082510e7124cc50d728f1b875639cb4e22312cc upstream. truncating and deleting regular files are unbound operations, so it's not good to do them in a single transaction. This patch makes btrfs_truncate and btrfs_delete_inode start a new transaction after all items in a tree leaf are deleted. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Make fallocate(2) more ENOSPC friendlyYan, Zheng
commit 5a303d5d4b8055d2e5a03e92d04745bfc5881a22 upstream. fallocate(2) may allocate large number of file extents, so it's not good to do it in a single transaction. This patch make fallocate(2) start a new transaction for each file extents it allocates. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Avoid orphan inodes cleanup during committing transactionYan, Zheng
commit 2e4bfab97055aa6acdd0637913bd705c2d6506d6 upstream. btrfs_lookup_dentry may trigger orphan cleanup, so it's not good to call it while committing a transaction. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Avoid orphan inodes cleanup while replaying logYan, Zheng
commit c71bf099abddf3e0fdc27f251ba76fca1461d49a upstream. We do log replay in a single transaction, so it's not good to do unbound operations. This patch cleans up orphan inodes cleanup after replaying the log. It also avoids doing other unbound operations such as truncating a file during replaying log. These unbound operations are postponed to the orphan inode cleanup stage. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Fix disk_i_size update corner caseYan, Zheng
commit c216775458a2ee345d9412a2770c2916acfb5d30 upstream. There are some cases file extents are inserted without involving ordered struct. In these cases, we update disk_i_size directly, without checking pending ordered extent and DELALLOC bit. This patch extends btrfs_ordered_update_i_size() to handle these cases. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Rewrite btrfs_drop_extentsYan, Zheng
commit 920bbbfb05c9fce22e088d20eb9dcb8f96342de9 upstream. Rewrite btrfs_drop_extents by using btrfs_duplicate_item, so we can avoid calling lock_extent within transaction. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Add btrfs_duplicate_itemYan, Zheng
commit ad48fd754676bfae4139be1a897b1ea58f9aaf21 upstream. btrfs_duplicate_item duplicates item with new key, guaranteeing the source item and the new items are in the same tree leaf and contiguous. It allows us to split file extent in place, without using lock_extent to prevent bookend extent race. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13Btrfs: Avoid superfluous tree-log writeoutYan, Zheng
commit 8cef4e160d74920ad1725f58c89fd75ec4c4ac38 upstream. We allow two log transactions at a time, but use same flag to mark dirty tree-log btree blocks. So we may flush dirty blocks belonging to newer log transaction when committing a log transaction. This patch fixes the issue by using two flags to mark dirty tree-log btree blocks. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13ext4: fix freeze deadlock under IOEric Sandeen
commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f upstream. Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks when freezing a filesystem which had active IO; the vfs_check_frozen level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing through. Duh. Changing the test to FREEZE_TRANS should let the normal freeze syncing get through the fs, but still block any transactions from starting once the fs is completely frozen. I tested this by running fsstress in the background while periodically snapshotting the fs and running fsck on the result. I ran into occasional deadlocks, but different ones. I think this is a fine fix for the problem at hand, and the other deadlocky things will need more investigation. Reported-by: Phillip Susi <psusi@cfl.rr.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13jfs: don't allow os2 xattr namespace overlap with othersDave Kleikamp
commit aca0fa34bdaba39bfddddba8ca70dba4782e8fe6 upstream. It's currently possible to bypass xattr namespace access rules by prefixing valid xattr names with "os2.", since the os2 namespace stores extended attributes in a legacy format with no prefix. This patch adds checking to deny access to any valid namespace prefix following "os2.". Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Reported-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13signalfd: fill in ssi_int for posix timers and message queuesNathan Lynch
commit a2a20c412c86e0bb46a9ab0dd31bcfe6d201b913 upstream. If signalfd is used to consume a signal generated by a POSIX interval timer or POSIX message queue, the ssi_int field does not reflect the data (sigevent->sigev_value) supplied to timer_create(2) or mq_notify(3). (The ssi_ptr field, however, is filled in.) This behavior differs from signalfd's treatment of sigqueue-generated signals -- see the default case in signalfd_copyinfo. It also gives results that differ from the case when a signal is handled conventionally via a sigaction-registered handler. So, set signalfd_siginfo->ssi_int in the remaining cases (__SI_TIMER, __SI_MESGQ) where ssi_ptr is set. akpm: a non-back-compatible change. Merge into -stable to minimise the number of kernels which are in the field and which miss this feature. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13fs/ecryptfs/file.c: introduce missing freeJulia Lawall
commit ceeab92971e8af05c1e81a4ff2c271124b55bb9b upstream. The comments in the code indicate that file_info should be released if the function fails. This releasing is done at the label out_free, not out. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r exists@ local idexpression x; statement S; expression E; identifier f,f1,l; position p1,p2; expression *ptr != NULL; @@ x@p1 = kmem_cache_zalloc(...); ... if (x == NULL) S <... when != x when != if (...) { <+...x...+> } ( x->f1 = E | (x->f1 == NULL || ...) | f(...,x->f1,...) ) ...> ( return <+...x...+>; | return@p2 ...; ) @script:python@ p1 << r.p1; p2 << r.p2; @@ print "* file: %s kmem_cache_zalloc %s" % (p1[0].file,p1[0].line) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13ecryptfs: release reference to lower mount if interpose failsLino Sanfilippo
commit 31f73bee3e170b7cabb35db9e2f4bf7919b9d036 upstream. In ecryptfs_lookup_and_interpose_lower() the lower mount is not decremented if allocation of a dentry info struct failed. As a result the lower filesystem cant be unmounted any more (since it is considered busy). This patch corrects the reference counting. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13eCryptfs: Handle ioctl calls with unlocked and compat functionsTyler Hicks
commit c43f7b8fb03be8bcc579bfc4e6ab70eac887ab55 upstream. Lower filesystems that only implemented unlocked_ioctl weren't being passed ioctl calls because eCryptfs only checked for lower_file->f_op->ioctl and returned -ENOTTY if it was NULL. eCryptfs shouldn't implement ioctl(), since it doesn't require the BKL. This patch introduces ecryptfs_unlocked_ioctl() and ecryptfs_compat_ioctl(), which passes the calls on to the lower file system. https://bugs.launchpad.net/ecryptfs/+bug/469664 Reported-by: James Dupin <james.dupin@gmail.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13blkdev: cgroup whitelist permission fixChris Wright
commit b7300b78d1a87625975a799a109a2f98d77757c8 upstream. The cgroup device whitelist code gets confused when trying to grant permission to a disk partition that is not currently open. Part of blkdev_open() includes __blkdev_get() on the whole disk. Basically, the only ways to reliably allow a cgroup access to a partition on a block device when using the whitelist are to 1) also give it access to the whole block device or 2) make sure the partition is already open in a different context. The patch avoids the cgroup check for the whole disk case when opening a partition. Addresses https://bugzilla.redhat.com/show_bug.cgi?id=589662 Signed-off-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Serge E. Hallyn <serue@us.ibm.com> Reported-by: Vivek Goyal <vgoyal@redhat.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: "Daniel P. Berrange" <berrange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-13splice: fix misuse of SPLICE_F_NONBLOCKMiklos Szeredi
commit 6965031d331a642e31278fa1b5bd47f372ffdd5d upstream. SPLICE_F_NONBLOCK is clearly documented to only affect blocking on the pipe. In __generic_file_splice_read(), however, it causes an EAGAIN if the page is currently being read. This makes it impossible to write an application that only wants failure if the pipe is full. For example if the same process is handling both ends of a pipe and isn't otherwise able to determine whether a splice to the pipe will fill it or not. We could make the read non-blocking on O_NONBLOCK or some other splice flag, but for now this is the simplest fix. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10CIFS: Fix compile error with __init in cifs_init_dns_resolver() definitionMichael Neuling
An allmodconfig compile on ppc64 with 2.6.32.17 currently gives this error fs/cifs/dns_resolve.h:27: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'cifs_init_dns_resolver' This adds the correct header file to fix this. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10CIFS: Remove __exit mark from cifs_exit_dns_resolver()David Howells
commit 51c20fcced5badee0e2021c6c89f44aa3cbd72aa upstream. Remove the __exit mark from cifs_exit_dns_resolver() as it's called by the module init routine in case of error, and so may have been discarded during linkage. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10GFS2: rename causes kernel OopsBob Peterson
commit 728a756b8fcd22d80e2dbba8117a8a3aafd3f203 upstream. This patch fixes a kernel Oops in the GFS2 rename code. The problem was in the way the gfs2 directory code was trying to re-use sentinel directory entries. In the failing case, gfs2's rename function was renaming a file to another name that had the same non-trivial length. The file being renamed happened to be the first directory entry on the leaf block. First, the rename code (gfs2_rename in ops_inode.c) found the original directory entry and decided it could do its job by simply replacing the directory entry with another. Therefore it determined correctly that no block allocations were needed. Next, the rename code deleted the old directory entry prior to replacing it with the new name. Therefore, the soon-to-be replaced directory entry was temporarily made into a directory entry "sentinel" or a place holder at the start of a leaf block. Lastly, it went to re-add the replacement directory entry in that leaf block. However, when gfs2_dirent_find_space was looking for space in the leaf block, it used the wrong value for the sentinel. That threw off its calculations so later it decides it can't really re-use the sentinel and therefore must allocate a new leaf block. But because it previously decided to re-use the directory entry, it didn't waste the time to grab a new block allocation for the inode. Therefore, the inode's i_alloc pointer was still NULL and it crashes trying to reference it. In the case of sentinel directory entries, the entire dirent is reused, not just the "free space" portion of it, and therefore the function gfs2_dirent_find_space should use the value 0 rather than GFS2_DIRENT_SIZE(0) for the actual dirent size. Fixing this calculation enables the reproducer programs to work properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10xfs: prevent swapext from operating on write-only filesDan Rosenberg
commit 1817176a86352f65210139d4c794ad2d19fc6b63 upstream. This patch prevents user "foo" from using the SWAPEXT ioctl to swap a write-only file owned by user "bar" into a file owned by "foo" and subsequently reading it. It does so by checking that the file descriptors passed to the ioctl are also opened for reading. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10NFS: kswapd must not block in nfs_release_pageTrond Myklebust
commit b608b283a962caaa280756bc8563016a71712acf upstream See https://bugzilla.kernel.org/show_bug.cgi?id=16056 If other processes are blocked waiting for kswapd to free up some memory so that they can make progress, then we cannot allow kswapd to block on those processes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-10acl trouble after upgrading ubuntuJ. Bruce Fields
commit d327cf7449e6fd5cbac784c641770e9366faa386 upstream. Subject: nfs: fix acl decoding Commit 28f566942c6b1d929f5e240e69e7081b77b238d3 "NFS: use dynamically computed compound_hdr.replen for xdr_inline_pages offset" accidentally changed the amount of space to allow for the acl reply, resulting in an IO error on attempts to get an acl. Reported-by: Paul Rudin <paul@rudin.co.uk> Cc: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jeremy Kerr <jeremy.kerr@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ecryptfs: Bugfix for error related to ecryptfs_hash_bucketsAndre Osterhues
commit a6f80fb7b5986fda663d94079d3bba0937a6b6ff upstream. The function ecryptfs_uid_hash wrongly assumes that the second parameter to hash_long() is the number of hash buckets instead of the number of hash bits. This patch fixes that and renames the variable ecryptfs_hash_buckets to ecryptfs_hash_bits to make it clearer. Fixes: CVE-2010-2492 Signed-off-by: Andre Osterhues <aosterhues@escrypt.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02GFS2: Fix up system xattrsSteven Whitehouse
commit 2646a1f61a3b5525914757f10fa12b5b94713648 upstream. This code has been shamelessly stolen from XFS at the suggestion of Christoph Hellwig. I've not added support for cached ACLs so far... watch for that in a later patch, although this is designed in such a way that they should be easy to add. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Make fsync sync new parent directories in no-journal modeFrank Mayhar
commit 14ece1028b3ed53ffec1b1213ffc6acaf79ad77c upstream (as of v2.6.34-git13) Add a new ext4 state to tell us when a file has been newly created; use that state in ext4_sync_file in no-journal mode to tell us when we need to sync the parent directory as well as the inode and data itself. This fixes a problem in which a panic or power failure may lose the entire file even when using fsync, since the parent directory entry is lost. Addresses-Google-Bug: #2480057 Signed-off-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Fix compat EXT4_IOC_ADD_GROUPBen Hutchings
commit 4d92dc0f00a775dc2e1267b0e00befb783902fe7 upstream (as of v2.6.34-git13) struct ext4_new_group_input needs to be converted because u64 has only 32-bit alignment on some 32-bit architectures, notably i386. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Conditionally define compat ioctl numbersBen Hutchings
commit 899ad0cea6ad7ff4ba24b16318edbc3cbbe03fad upstream (as of v2.6.34-git13) It is unnecessary, and in general impossible, to define the compat ioctl numbers except when building the filesystem with CONFIG_COMPAT defined. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: restart ext4_ext_remove_space() after transaction restartDmitry Monakhov
commit 0617b83fa239db9743a18ce6cc0e556f4d0fd567 upstream (as of v2.6.34-git13) If i_data_sem was internally dropped due to transaction restart, it is necessary to restart path look-up because extents tree was possibly modified by ext4_get_block(). https://bugzilla.kernel.org/show_bug.cgi?id=15827 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Clear the EXT4_EOFBLOCKS_FL flag only when warrantedTheodore Ts'o
commit 786ec7915e530936b9eb2e3d12274145cab7aa7d upstream (as of v2.6.34-git13) Dimitry Monakhov discovered an edge case where it was possible for the EXT4_EOFBLOCKS_FL flag could get cleared unnecessarily. This is true; I have a test case that can be exercised via downloading and decompressing the file: wget ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-testcases/eofblocks-fl-test-case.img.bz2 bunzip2 eofblocks-fl-test-case.img dd if=/dev/zero of=eofblocks-fl-test-case.img bs=1k seek=17925 bs=1k count=1 conv=notrunc However, triggering it in real life is highly unlikely since it requires an extremely fragmented sparse file with a hole in exactly the right place in the extent tree. (It actually took quite a bit of work to generate this test case.) Still, it's nice to get even extreme corner cases to be correct, so this patch makes sure that we don't clear the EXT4_EOFBLOCKS_FL incorrectly even in this corner case. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Avoid crashing on NULL ptr dereference on a filesystem errorTheodore Ts'o
commit f70f362b4a6fe47c239dbfb3efc0cc2c10e4f09c upstream (as of v2.6.34-git13) If the EOFBLOCK_FL flag is set when it should not be and the inode is zero length, then eh_entries is zero, and ex is NULL, so dereferencing ex to print ex->ee_block causes a kernel OOPS in ext4_ext_map_blocks(). On top of that, the error message which is printed isn't very helpful. So we fix this by printing something more explanatory which doesn't involve trying to print ex->ee_block. Addresses-Google-Bug: #2655740 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Use bitops to read/modify i_flags in struct ext4_inode_infoDmitry Monakhov
commit 12e9b892002d9af057655d35b44db8ee9243b0dc upstream (as of v2.6.34-git13) At several places we modify EXT4_I(inode)->i_flags without holding i_mutex (ext4_do_update_inode, ...). These modifications are racy and we can lose updates to i_flags. So convert handling of i_flags to use bitops which are atomic. https://bugzilla.kernel.org/show_bug.cgi?id=15792 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Show journal_checksum optionJan Kara
commit 39a4bade8c1826b658316d66ee81c09b0a4d7d42 upstream (as of v2.6.34-git13) We failed to show journal_checksum option in /proc/mounts. Fix it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: check for a good block group before loading buddy pagesCurt Wohlgemuth
commit 8a57d9d61a6e361c7bb159dda797672c1df1a691 upstream (as of v2.6.34-git13) This adds a new field in ext4_group_info to cache the largest available block range in a block group; and don't load the buddy pages until *after* we've done a sanity check on the block group. With large allocation requests (e.g., fallocate(), 8MiB) and relatively full partitions, it's easy to have no block groups with a block extent large enough to satisfy the input request length. This currently causes the loop during cr == 0 in ext4_mb_regular_allocator() to load the buddy bitmap pages for EVERY block group. That can be a lot of pages. The patch below allows us to call ext4_mb_good_group() BEFORE we load the buddy pages (although we have check again after we lock the block group). Addresses-Google-Bug: #2578108 Addresses-Google-Bug: #2704453 Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Prevent creation of files larger than RLIMIT_FSIZE using fallocateNikanth Karthikesan
commit 6d19c42b7cf81c39632b6d4dbc514e8449bcd346 upstream (as of v2.6.34-git13) Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit and create a file larger than the limit. Add a check for that. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Amit Arora <aarora@in.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: Remove extraneous newlines in ext4_msg() callsCurt Wohlgemuth
commit fbe845ddf368f77f86aa7500f8fd2690f54c66a8 upstream (as of v2.6.34-git13) Addresses-Google-Bug: #2562325 Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-08-02ext4: init statistics after journal recoveryDmitry Monakhov
commit 84061e07c5fbbbf9dc8aef8fb750fc3a2dfc31f3 upstream (as of v2.6.34-git13) Currently block/inode/dir counters initialized before journal was recovered. In fact after journal recovery this info will probably change. And freeblocks it critical for correct delalloc mode accounting. https://bugzilla.kernel.org/show_bug.cgi?id=15768 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>