aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2006-06-26[PATCH] proc: Kill proc_mem_inode_operationsEric W. Biederman
The inode operations only exist to support the proc_permission function. Currently mem_read and mem_write have all the same permission checks as ptrace. The fs check makes no sense in this context, and we can trivially get around it by calling ptrace. So simply the code by killing the strange weird case. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Remove bogus proc_task_permissionEric W. Biederman
First we can access every /proc/<tgid>/task/<pid> directory as /proc/<pid> so proc_task_permission is not usefully limiting visibility. Second having related filesystems information should have nothing to do with process visibility. kill does not implement any checks like that. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Replace proc_inode.type with proc_inode.fdEric W. Biederman
The sole renaming use of proc_inode.type is to discover the file descriptor number, so just store the file descriptor number and don't wory about processing this field. This removes any /proc limits on the maximum number of file descriptors, and clears the path to make the hard coded /proc inode numbers go away. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Simplify the ownership rules for /procEric W. Biederman
Currently in /proc if the task is dumpable all of files are owned by the tasks effective users. Otherwise the files are owned by root. Unless it is the /proc/<tgid>/ or /proc/<tgid>/task/<pid> directory in that case we always make the directory owned by the effective user. However the special case for directories is pointless except as a way to read the effective user, because the permissions on both of those directories are world readable, and executable. /proc/<tgid>/status provides a much better way to read a processes effecitve userid, so it is silly to try to provide that on the directory. So this patch simplifies the code by removing a pointless special case and gets us one step closer to being able to remove the hard coded /proc inode numbers. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Remove unnecessary and misleading assignments from ↵Eric W. Biederman
proc_pid_make_inode The removed fields are already set by proc_alloc_inode. Initializing them in proc_alloc_inode implies they need it for proper cleanup. At least ei->pde was not set on all paths making it look like proc_alloc_inode was buggy. So just remove the redundant assignments. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Remove useless BKL in proc_pid_readlinkEric W. Biederman
We already call everything except do_proc_readlink outside of the BKL in proc_pid_followlink, and there appears to be nothing in do_proc_readlink that needs any special protection. So remove this leftover from one of the BKL cleanup efforts. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] proc: Fix the .. inode number on /proc/<pid>/fdEric W. Biederman
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] nfsd kconfig: select things at the closest tristate instead of boolHerbert Xu
I noticed recently that my CONFIG_CRYPTO_MD5 turned into a y again instead of m. It turns out that CONFIG_NFSD_V4 is selecting it to be y even though I've chosen to compile nfsd as a module. In general when we have a bool sitting under a tristate it is better to select things you need from the tristate rather than the bool since that allows the things you select to be modules. The following patch does it for nfsd. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] i4l: Gigaset drivers: add IOCTLs to compat_ioctl.hHansjoerg Lipp
Add the IOCTLs of the Gigaset drivers to compat_ioctl.h in order to make them available for 32 bit programs on 64 bit platforms. Please merge. Signed-off-by: Hansjoerg Lipp <hjlipp@web.de> Acked-by: Tilman Schmidt <tilman@imap.cc> Cc: Karsten Keil <kkeil@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] ext3: Add "-o bh" optionBadari Pulavarty
This patch adds "-o bh" option to force use of buffer_heads. This option is needed when we make "nobh" as default - and if we run into problems. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] reiserfs: remove reiserfs_aio_write()Alexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] keys: add a way to store the appropriate context for newly-created keysMichael LeMay
Add a /proc/<pid>/attr/keycreate entry that stores the appropriate context for newly-created keys. Modify the selinux_key_alloc hook to make use of the new entry. Update the flask headers to include a new "setkeycreate" permission for processes. Update the flask headers to include a new "create" permission for keys. Use the create permission to restrict which SIDs each task can assign to newly-created keys. Add a new parameter to the security hook "security_key_alloc" to indicate whether it is being invoked by the kernel, or from userspace. If it is being invoked by the kernel, the security hook should never fail. Update the documentation to reflect these changes. Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] fs: use list_move()Akinobu Mita
This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under fs/. Cc: Ian Kent <raven@themaw.net> Acked-by: Joel Becker <joel.becker@oracle.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Hans Reiser <reiserfs-dev@namesys.com> Cc: Urban Widmark <urban@teststation.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] core: use list_move()Akinobu Mita
This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B). Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] use list_add_tail() instead of list_add()Akinobu Mita
This patch converts list_add(A, B.prev) to list_add_tail(A, &B) for readability. Acked-by: Karsten Keil <kkeil@suse.de> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Acked-by: Jan Kara <jack@suse.cz> AOLed-by: David Woodhouse <dwmw2@infradead.org> Cc: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] uclinux: use PER_LINUX_32BIT in binfmt_flatMalcolm Parsons
binfmt_flat.c calls set_personality with PER_LINUX as the personality. On the arm architecture this results in the program running in 26bit usermode. PER_LINUX_32BIT should be used instead. This doesn't affect other architectures that use binfmt_flat. Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] xfs: update ->flush method protoAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25Fix NFS2 compile errorLinus Torvalds
Trond had apparently merged the same patch twice, causing a duplicate include of the "internal.h" file, with resulting obvious confusion. Tssk. I'm the only one allowed to send out trees that don't even compile! Who does this Trond guy think he is? Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25Merge git://git.linux-nfs.org/pub/linux/nfs-2.6Linus Torvalds
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits) nfs: remove nfs_put_link() nfs-build-fix-99 git-nfs-build-fixes Merge branch 'odirect' NFS: alloc nfs_read/write_data as direct I/O is scheduled NFS: Eliminate nfs_get_user_pages() NFS: refactor nfs_direct_free_user_pages NFS: remove user_addr, user_count, and pos from nfs_direct_req NFS: "open code" the NFS direct write rescheduler NFS: Separate functions for counting outstanding NFS direct I/Os NLM: Fix reclaim races NLM: sem to mutex conversion locks.c: add the fl_owner to nlm_compare_locks NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts NFS: Split fs/nfs/inode.c NFS: Fix typo in nfs_do_clone_mount() NFS: Fix compile errors introduced by referrals patches NFSv4: Ensure that referral mounts bind to a reserved port NFSv4: A root pathname is sent as a zero component4 NFSv4: Follow a referral ...
2006-06-25Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvbLinus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (244 commits) V4L/DVB (4210b): git-dvb: tea575x-tuner build fix V4L/DVB (4210a): git-dvb versus matroxfb V4L/DVB (4209): Added some BTTV PCI IDs for newer boards Fixes some sync issues between V4L/DVB development and GIT V4L/DVB (4206): Cx88-blackbird: always set encoder height based on tvnorm->id V4L/DVB (4205): Merge tda9887 module into tuner. V4L/DVB (4203): Explicitly set the enum values. V4L/DVB (4202): allow selecting CX2341x port mode V4L/DVB (4200): Disable bitrate_mode when encoding mpeg-1. V4L/DVB (4199): Add cx2341x-specific control array to cx2341x.c V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API V4L/DVB (4197): Port new MPEG API to saa7134-empress with saa6752hs V4L/DVB (4196): Port cx88-blackbird to the new MPEG API. V4L/DVB (4193): Update cx2341x fw encoding API doc. V4L/DVB (4192): Use control helpers for saa7115, cx25840, msp3400. V4L/DVB (4191): Add CX2341X MPEG encoder module. V4L/DVB (4190): Add helper functions for control processing to v4l2-common. V4L/DVB (4189): Add videodev support for VIDIOC_S/G/TRY_EXT_CTRLS. V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h V4L/DVB (4186): Add support for the DNTV Live! mini DVB-T card. ...
2006-06-25[PATCH] remove unlikely(sb) in prune_dcacheHua Zhong
likely profiling shows that the following is a miss. After boot: [+- ] Type | # True | # False | Function:Filename@Line +unlikely | 1074| 0 prune_dcache()@:fs/dcache.c@409 After a bonnie++ run: +unlikely | 66716| 19584 prune_dcache()@:fs/dcache.c@409 So remove it. Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext2: cleanup: put_page and comment fixEvgeniy Dushistov
Things which force me think a little: why so? Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Implement AT_SYMLINK_FOLLOW flag for linkatUlrich Drepper
When the linkat() syscall was added the flag parameter was added in the last minute but it wasn't used so far. The following patch should change that. My tests show that this is all that's needed. If OLDNAME is a symlink setting the flag causes linkat to follow the symlink and create a hardlink with the target. This is actually the behavior POSIX demands for link() as well but Linux wisely does not do this. With this flag (which will most likely be in the next POSIX revision) the programmer can choose the behavior, defaulting to the safe variant. As a side effect it is now possible to implement a POSIX-compliant link(2) function for those who are interested. touch file ln -s file symlink linkat(fd, "symlink", fd, "newlink", 0) -> newlink is hardlink of symlink linkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW) -> newlink is hardlink of file The value of AT_SYMLINK_FOLLOW is determined by the definition we already use in glibc. Signed-off-by: Ulrich Drepper <drepper@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fs: sys_poll with timeout -1 bug fixFrode Isaksen
If you do a poll() call with timeout -1, the wait will be a big number (depending on HZ) instead of infinite wait, since -1 is passed to the msecs_to_jiffies function. Signed-off-by: Frode Isaksen <frode.isaksen@gmail.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fix %s in affs_fill_super()Al Viro
%s is only valid if array is known to contain NUL or precision is given and does not exceed the size of array. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] kthread: convert smbiodSerge E. Hallyn
Update smbiod to use kthread instead of deprecated kernel_thread. [akpm@osdl.org: cleanup] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Remove needless checks in fs/9p/vfs_inode.cEric Sesterhenn
coverity found two needless checks in vfs_inode.c (cid #1165 and #1164) In both cases inode is always NULL when we goto error; either because it is still initialized to NULL or is set to NULL explicitly. This patch simply removes these checks to save some code. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Acked-by: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@lanl.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: scramble lock owner IDMiklos Szeredi
VFS uses current->files pointer as lock owner ID, and it wouldn't be prudent to expose this value to userspace. So scramble it with XTEA using a per connection random key, known only to the kernel. Only one direction needs to be implemented, since the ID is never sent in the reverse direction. The XTEA algorithm is implemented inline since it's simple enough to do so, and this adds less complexity than if the crypto API were used. Thanks to Jesper Juhl for the idea. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: add request interruptionMiklos Szeredi
Add synchronous request interruption. This is needed for file locking operations which have to be interruptible. However filesystem may implement interruptibility of other operations (e.g. like NFS 'intr' mount option). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: rename the interrupted flagMiklos Szeredi
Rename the 'interrupted' flag to 'aborted', since it indicates exactly that, and next patch will introduce an 'interrupted' flag for a Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: ensure FLUSH reaches userspaceMiklos Szeredi
All POSIX locks owned by the current task are removed on close(). If the FLUSH request resulting initiated by close() fails to reach userspace, there might be locks remaining, which cannot be removed. The only reason it could fail, is if allocating the request fails. In this case use the request reserved for RELEASE, or if that is currently used by another FLUSH, wait for it to become available. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: add POSIX file locking supportMiklos Szeredi
This patch adds POSIX file locking support to the fuse interface. This implementation doesn't keep any locking state in kernel. Unlocking on close() is handled by the FLUSH message, which now contains the lock owner id. Mandatory locking is not supported. The filesystem may enfoce mandatory locking in userspace if needed. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: add control filesystemMiklos Szeredi
Add a control filesystem to fuse, replacing the attributes currently exported through sysfs. An empty directory '/sys/fs/fuse/connections' is still created in sysfs, and mounting the control filesystem here provides backward compatibility. Advantages of the control filesystem over the previous solution: - allows the object directory and the attributes to be owned by the filesystem owner, hence letting unpriviled users abort the filesystem connection - does not suffer from module unload race [akpm@osdl.org: fix this fs for recent dhowells depredations] [akpm@osdl.org: fix 64-bit printk warnings] Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] fuse: no backgrounding on interruptMiklos Szeredi
Don't put requests into the background when a fatal interrupt occurs while the request is in userspace. This removes a major wart from the implementation. Backgrounding of requests was introduced to allow breaking of deadlocks. However now the same can be achieved by aborting the filesystem through the 'abort' sysfs attribute. This is a change in the interface, but should not cause problems, since these kinds of deadlocks never happen during normal operation. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] autofs4: need to invalidate children on tree mount expireIan Kent
I've found a case where invalid dentrys in a mount tree, waiting to be cleaned up by d_invalidate, prevent the expected expire. In this case dentrys created during a lookup for which a mount fails or has no entry in the mount map contribute to the d_count of the parent dentry. These dentrys may not be invalidated prior to comparing the interanl usage count of valid autofs dentrys against the dentry d_count which makes a mount tree appear busy so it doesn't expire. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ftruncate does not always update m/ctimePeter Staubach
In the course of trying to track down a bug where a file mtime was not being updated correctly, it was discovered that the m/ctime updates were not quite being handled correctly for ftruncate() calls. Quoth SUSv3: open(2): If O_TRUNC is set and the file did previously exist, upon successful completion, open() shall mark for update the st_ctime and st_mtime fields of the file. truncate(2): Upon successful completion, if the file size is changed, this function shall mark for update the st_ctime and st_mtime fields of the file, and the S_ISUID and S_ISGID bits of the file mode may be cleared. ftruncate(2): Upon successful completion, if fildes refers to a regular file, the ftruncate() function shall mark for update the st_ctime and st_mtime fields of the file and the S_ISUID and S_ISGID bits of the file mode may be cleared. If the ftruncate() function is unsuccessful, the file is unaffected. The open(O_TRUNC) and truncate cases were being handled correctly, but the ftruncate case was being handled like the truncate case. The semantics of truncate and ftruncate don't quite match, so ftruncate needs to be handled slightly differently. The attached patch addresses this issue for ftruncate(2). My thanx to Stephen Tweedie and Trond Myklebust for their help in understanding the situation and semantics. Signed-off-by: Peter Staubach <staubach@redhat.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext3: cleanup dead code in ext3_add_entry()Johann Lombardi
The variables nlen and rlen are defined/initialized but not used in ext3_add_entry(). Signed-off-by: Johann Lombardi <johann.lombardi@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] 9pfs: missing result check in v9fs_vfs_readlink() and v9fs_vfs_link()Florin Malita
__getname() may fail and return NULL (as pointed out by Coverity 437 & 1220). Signed-off-by: Florin Malita <fmalita@gmail.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com> Cc: <rminnich@lanl.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] epoll: use unlocked wqueue operationsDavide Libenzi
A few days ago Arjan signaled a lockdep red flag on epoll locks, and precisely between the epoll's device structure lock (->lock) and the wait queue head lock (->lock). Like I explained in another email, and directly to Arjan, this can't happen in reality because of the explicit check at eventpoll.c:592, that does not allow to drop an epoll fd inside the same epoll fd. Since lockdep is working on per-structure locks, it will never be able to know of policies enforced in other parts of the code. It was decided time ago of having the ability to drop epoll fds inside other epoll fds, that triggers a very trick wakeup operations (due to possibly reentrant callback-driven wakeups) handled by the ep_poll_safewake() function. While looking again at the code though, I noticed that all the operations done on the epoll's main structure wait queue head (->wq) are already protected by the epoll lock (->lock), so that locked-style functions can be used to manipulate the ->wq member. This makes both a lock-acquire save, and lockdep happy. Running totalmess on my dual opteron for a while did not reveal any problem so far: http://www.xmailserver.org/totalmess.c Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Make EXT2_DEBUG work againValerie Henson
This patch makes EXT2_DEBUG work again. Due to lack of proper include file, EXT2_DEBUG was undefined in bitmap.c and ext2_count_free() is left out. Moved to balloc.c and removed bitmap.c entirely. Second, debug versions of ext2_count_free_{inodes/blocks} reacquires superblock lock. Moved lock into callers. Signed-off-by: Val Henson <val_henson@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Make procfs obligatory except under CONFIG_EMBEDDEDH. Peter Anvin
Make procfs non-optional unless EMBEDDED is set, just like sysfs. procfs is already de facto required for a large subset of Linux functionality. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext3_fsblk_t: the rest of in-kernel filesystem blocks conversionMingming Cao
Convert the ext3 in-kernel filesystem blocks to ext3_fsblk_t. Convert the rest of all unsigned long type in-kernel filesystem blocks to ext3_fsblk_t, and replace the printk format string respondingly. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext3_fsblk_t: filesystem, group blocks and bug fixesMingming Cao
Some of the in-kernel ext3 block variable type are treated as signed 4 bytes int type, thus limited ext3 filesystem to 8TB (4kblock size based). While trying to fix them, it seems quite confusing in the ext3 code where some blocks are filesystem-wide blocks, some are group relative offsets that need to be signed value (as -1 has special meaning). So it seem saner to define two types of physical blocks: one is filesystem wide blocks, another is group-relative blocks. The following patches clarify these two types of blocks in the ext3 code, and fix the type bugs which limit current 32 bit ext3 filesystem limit to 8TB. With this series of patches and the percpu counter data type changes in the mm tree, we are able to extend exts filesystem limit to 16TB. This work is also a pre-request for the recent >32 bit ext3 work, and makes the kernel to able to address 48 bit ext3 block a lot easier: Simply redefine ext3_fsblk_t from unsigned long to sector_t and redefine the format string for ext3 filesystem block corresponding. Two RFC with a series patches have been posted to ext2-devel list and have been reviewed and discussed: http://marc.theaimsgroup.com/?l=ext2-devel&m=114722190816690&w=2 http://marc.theaimsgroup.com/?l=ext2-devel&m=114784919525942&w=2 Patches are tested on both 32 bit machine and 64 bit machine, <8TB ext3 and >8TB ext3 filesystem(with the latest to be released e2fsprogs-1.39). Tests includes overnight fsx, tiobench, dbench and fsstress. This patch: Defines ext3_fsblk_t and ext3_grpblk_t, and the printk format string for filesystem wide blocks. This patch classifies all block group relative blocks, and ext3_fsblk_t blocks occurs in the same function where used to be confusing before. Also include kernel bug fixes for filesystem wide in-kernel block variables. There are some fileystem wide blocks are treated as int/unsigned int type in the kernel currently, especially in ext3 block allocation and reservation code. This patch fixed those bugs by converting those variables to ext3_fsblk_t(unsigned long) type. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytesNeilBrown
The problem is that when we write to a file, the copy from userspace to pagecache is first done with preemption disabled, so if the source address is not immediately available the copy fails *and* *zeros* *the* *destination*. This is a problem because a concurrent read (which admittedly is an odd thing to do) might see zeros rather that was there before the write, or what was there after, or some mixture of the two (any of these being a reasonable thing to see). If the copy did fail, it will immediately be retried with preemption re-enabled so any transient problem with accessing the source won't cause an error. The first copying does not need to zero any uncopied bytes, and doing so causes the problem. It uses copy_from_user_atomic rather than copy_from_user so the simple expedient is to change copy_from_user_atomic to *not* zero out bytes on failure. The first of these two patches prepares for the change by fixing two places which assume copy_from_user_atomic does zero the tail. The two usages are very similar pieces of code which copy from a userspace iovec into one or more page-cache pages. These are changed to remove the assumption. The second patch changes __copy_from_user_inatomic* to not zero the tail. Once these are accepted, I will look at similar patches of other architectures where this is important (ppc, mips and sparc being the ones I can find). This patch: There is a problem with __copy_from_user_inatomic zeroing the tail of the buffer in the case of an error. As it is called in atomic context, the error may be transient, so it results in zeros being written where maybe they shouldn't be. In the usage in filemap, this opens a window for a well timed read to see data (zeros) which is not consistent with any ordering of reads and writes. Most cases where __copy_from_user_inatomic is called, a failure results in __copy_from_user being called immediately. As long as the latter zeros the tail, the former doesn't need to. However in *copy_from_user_iovec implementations (in both filemap and ntfs/file), it is assumed that copy_from_user_inatomic will zero the tail. This patch removes that assumption, so that after this patch it will be safe for copy_from_user_inatomic to not zero the tail. This patch also adds some commentary to filemap.h and asm-i386/uaccess.h. After this patch, all architectures that might disable preempt when kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed". This includes - powerpc - i386 - mips - sparc Signed-off-by: Neil Brown <neilb@suse.de> Cc: David Howells <dhowells@redhat.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext3: remove inconsistent space before exclamation point in mount codeTheodore Ts'o
This was reported as Debian bug #336604. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext3: fix memory leak when the journal file is corruptedTheodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] ext2: clean up dead code from mount codeTheodore Ts'o
The variable i is guaranteed to be the same as db_count given the previous for loop. So get rid of it since it's dead code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] Avoid disk sector_t overflow for >2TB ext3 filesystemMingming Cao
If ext3 filesystem is larger than 2TB, and sector_t is a u32 (i.e. CONFIG_LBD not defined in the kernel), the calculation of the disk sector will overflow. Add check at ext3_fill_super() and ext3_group_extend() to prevent mount/remount/resize >2TB ext3 filesystem if sector_t size is 4 bytes. Verified this patch on a 32 bit platform without CONFIG_LBD defined (sector_t is 32 bits long), mount refuse to mount a 10TB ext3. Signed-off-by: Mingming Cao<cmm@us.ibm.com> Acked-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] openpromfs: factorize outJan Engelhardt
"Move" "common code" out to PTR_NOD, which does the conversion from private pointer to node number. This is to reduce potential casting/conversion errors due to redundancy. (The naming PTR_NOD follows PTR_ERR, turning a pointer into xyz.) [akpm@osdl.org: cleanups] Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25[PATCH] openpromfs: remove unnecessary castsJan Engelhardt
Remove unnecessary casts in fs/openpromfs/inode.c Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>